Introduction

With the advancement of research and development in multiprocessing systems,
researchers are focusing greater attention on the design of the interconnections between
components of a system. Reliability analysis and performance evaluation are the essential aspects
of any study into the effectiveness of a structure. We have begun investigating the reliability
problem for two types of multiprocessing interconnection schemes: multistage interconnection
networks and hypercubes. A group comprising Suresh Rai, Jerry L. Trahan, and three students,
T. Smailus, S. Ananthakrishnan, and P. Paragi, was formed to work on a range of subtopics.
Besides these, a Ph.D. student of Dr. Rai, S. Soh, and two M.S. students of Dr. Trahan, A.
Kulkarni and R. Ahmed, looked into some of the related aspects of the problem. Dr. Rai also
interacted with Dr. S. Latifi of the University of Nevada at Las Vegas to devise a strategy for
bounding hypercube reliability. A bibliography at the end of this report compiles our research
results obtained so far on the topic.

Results 

Multistage  interconnection  networks 

We have established simple and efficient algorithms for terminal reliability (TR) and
broadcast reliability (BR) evaluation of the shuffle-exchange network with an extra stage (SENE)
[1-3]. In the SENE, each input is connected to each output by a pair of complete binary trees such
that the input is connected by a directed edge to each of the roots, and the leaves of both trees are
identical. These very regular paths from an input to the outputs offer us the structure necessary to
solve the TR and BR problems efficiently. We first developed a sum of disjoint products approach
to this problem [2]. Later, we developed an efficient algorithm for BR evaluation of an NxN
SENE by a recursive approach, resulting in a recurrence equation that can be evaluated within a
constant amount of time for each of log N levels of recursion [1]. This result establishes that the
problem of evaluating the broadcast reliability for a SENE is not only not NP-hard, unlike the
same problem for a general network, but in fact has a very efficient algorithm. We extended this
algorithm to an efficient algorithm for the K-terminal reliability problem.

If we consider a deterministic model for a network in which each component is given as
working or failed, then we can study a set of decision problems analogous to the reliability
problems of interest in the stochastic model. For example, a terminal decision problem is the
problem of determining whether a path exists from a specific source to a specific terminal, given a
network with a known set of failures. Efficient algorithms are of interest to determine whether a
given network with certain working and failed components at a certain point in time can effect a
needed set of connections. These algorithms are also of interest as they may provide techniques
useful for the reliability evaluation of specific networks.
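
As an illustration only, the terminal decision problem for a generic directed
network can be sketched with a breadth-first search over the working
components. This is not the MIN-specific approach of [1]; the function name,
the edge-list input format, and the toy network below are assumptions made for
the example.

```python
from collections import deque

def terminal_decision(links, failed_nodes, failed_links, source, terminal):
    """Return True if a path of working components joins source to terminal."""
    if source in failed_nodes or terminal in failed_nodes:
        return False
    # Build adjacency lists over working links and working end nodes only.
    adjacency = {}
    for u, v in links:
        if (u, v) in failed_links or u in failed_nodes or v in failed_nodes:
            continue
        adjacency.setdefault(u, []).append(v)
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == terminal:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical toy network: two disjoint directed paths from "in" to "out".
links = [("in", "a"), ("in", "b"), ("a", "out"), ("b", "out")]
one_switch_down = terminal_decision(links, {"a"}, set(), "in", "out")
both_switches_down = terminal_decision(links, {"a", "b"}, set(), "in", "out")
```

With one intermediate switch failed the second path still connects the pair;
with both failed no path remains.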

For MINs, we have developed a set of approaches for the terminal decision, broadcast
decision, and network decision problems, and for the general S, T decision problem for an input
set S and output set T [1]. These approaches are based on either testing for the existence of an
appropriate pathset or testing for the nonexistence of a cutset. For the broadcast and network
decision problems, the cutset approach leads to better algorithms as we can more concisely
describe and locate cutsets than pathsets.

Hypercubes

The decision problem for hypercubes is explored considering a deterministic model for the
system. A real world problem is modelled assuming a given set of failures in a cube, which may
be restricted to subcube (a 0-subcube represents a node) failures only, or link failures only, or both
subcube and link failures. The failures could be of a permanent nature or of a temporary nature.
A permanent failure type refers to a complete outage scenario. A temporary fault is nothing but
unavailability of an i-subcube which is currently busy with some processes and is involved in
executing an algorithm. The question is then asked how to determine the size and location of the
maximal dimension available (fault-free) subcube. To help answer this problem we have defined
two operators, namely, # and $. We used these operators to develop a method for identifying all
maximal size, fault-free subcubes contained in a faulty cube [3]. (See attached report for details.)

Additionally, we have addressed the problem of dynamically allocating subcubes of a
hypercube to multiple tasks [4]. Our allocation algorithm falls into the category of available cube
techniques, which offer the advantage of quickly recognizing whether or not a requested subcube is
available in the free list of subcubes. The allocation is done using a best-fit concept to select a
subcube for allocation, which in turn utilizes the notion of overlap-syndrome to quantify the
overlap among free subcubes. Our technique has full subcube recognition ability and thus
recognizes more subcubes than bit-mapped techniques (buddy, Gray code, and its
variants). We have also developed a corresponding deallocation algorithm. The algorithms work
with the previous method for handling faulty nodes and links in the hypercube.

A probabilistic model for hypercubes is considered in [5, 6]. The studies are confined to
terminal and network reliability evaluations for their exact and approximate expressions.

General 

In addition, we have developed an efficient Boolean approach, called CAREL [7], to solve
reliability problems in general networks. The effect of preprocessing of path or cut terms on the
overall reliability expression is experimentally determined in [8]. Moreover, we have studied a
capacity related reliability problem in which two parameters, the availability of a link and its
capacity, are used to quantify the reliability measure [9]. All these efforts are helpful in
understanding different aspects of the reliability evaluation problem.


Ongoing  Efforts 

We  are  currently  pursuing  the  following  directions  in  our  work. 

For MINs, the standard reliability analysis model assumes that only switches can fail, that
links are perfectly reliable, that failures are statistically independent, and that a switch is either
completely working or completely failed. Such restrictive assumptions are standard for reliability
analysis problems in general networks that are already intractable even with these assumptions. As
we have developed efficient algorithms for reliability analysis of MINs, we are seeking to loosen
these unrealistic restrictions on the analysis [10]. Consequently, we are developing methods to
incorporate link failures, dependence between component failures, and multimode components into
the analysis. Currently, we are handling each assumption separately, but our intention is to
develop reliability evaluation methods incorporating all these more realistic assumptions. We are
also working to develop a network reliability evaluation algorithm for the SENE.

For hypercubes, our efforts are in two directions. First, we are working to improve our
subcube allocation and deallocation schemes. Second, we are studying reliability evaluation of the
hypercube, working from the terminal reliability results obtained as stated above.
Abstract

The hypercube architecture is a popular topology for
many parallel processing applications. Several
researchers have analysed the performance and
dependability aspects of this architecture or its variants.
Fault tolerance by reconfiguration is another important
problem in a large distributed computing environment,
for continued operation of the hypercube multiprocessors
after the failure of one or more i-subcubes and/or links.
This paper considers the fault tolerance issue and presents
an algebraic technique, called ATARIC, to analyse the
problem. ATARIC (Algebraic Technique to Analyse
Reconfiguration for fault tolerance In a hyperCube) uses
algebraic operators to identify the maximum dimensional
fault-free subcube, and it thus helps in achieving graceful
degradation of the system. We analyse the complexity of
our algorithm. ATARIC is efficient as compared to the
algorithm of Ozguner and Aykanat [4], where the
inclusion-exclusion principle is used. Examples
illustrate the approach.

1.  Introduction 

Hypercube multiprocessors have been the focus of
many researchers over the past few years. The appealing
properties of the hypercube such as node and edge
symmetry, logarithmic diameter, high fault resilience,
scalability, and the ability to host popular
interconnection networks, viz., ring, torus, tree, and
linear array, have made this topology an excellent
candidate for many parallel processing applications [1-3].
Conceptually, the hypercube interconnection network is a
multidimensional binary cube with a processing element
(PE) at each of its nodes. An n-dimensional hypercube,
B_n, has 2^n processors and n2^(n-1) links. Each processor
has its own local memory and interprocessor
communication is done by explicit message passing.

Several variants to the hypercube topology such as
cube-connected-cycles (CCC), generalized hypercube
(GHC), bridged hypercube (BHC), twisted hypercube
(THC), folded hypercube (FHC), and star graph [2] are
described in the literature.

The suitability of a parallel architecture is evaluated
by analyzing its performance and reliability aspects.
Several researchers have investigated hypercube systems
using performance metrics such as number of nodes and
links, connectivity, diameter, average distance, cost,
expandability, etc. [3] with or without faults in B_n. Few
researchers have paid attention to dependability issues.
Dependability (in terms of reliability or availability)
prediction of a hypercube architecture uses a stochastic
graph model of B_n. Note that this prediction is quite
essential since hypercubes have the potential of use in
critical applications. Reliability (availability) prediction
is important for systems with short (long) mission
times.

Fault tolerance by reconfiguration is another
important problem in a large distributed computing
environment, for continued operation of the hypercube
multiprocessor after the failure of one or more i-subcubes
(a 0-subcube is a node or PE) and/or links. Algorithms
for diagnosing faulty processors and links have been
given [5-7]. Once the faulty elements have been
identified, graceful degradation can be achieved by
reconfiguring the multiprocessor and the distributed
algorithm running on the multiprocessor [4].
Fortunately, most parallel algorithms can be formulated
with the dimension n of the hypercube being a parameter
of the algorithm [8]. Hence, the reconfiguration problem
in a hypercube multiprocessor reduces to finding the
maximum dimensional fault-free subcube(s). A subcube
is a subset of a hypercube which preserves the properties
of the hypercube.

References [4,8] provide simple procedures to find
the maximum dimension "d" of a fault-free subcube.
However, as indicated by Ozguner and Aykanat [4], the
procedure of Becker and Simon [8] does not always find
the maximum dimension and, furthermore, does not
construct the set of fault-free d-subcubes. Ozguner and
Aykanat [4] made use of the principle of
inclusion-exclusion in algorithms that always find d and also the
number of fault-free d-subcubes or the complete set of
fault-free d-subcubes.

This paper introduces a new algebraic technique to
analyse reconfiguration for fault tolerance in a
hypercube, henceforth called ATARIC. ATARIC
also addresses the dependability problem. However, it is
different from stochastic graph model based dependability
measures such as K-terminal reliability, subcube
reliability, and task-based reliability presented in the
literature [9-11]. We present two operators, namely #
(sharp) and $ (dollar), to help describe ATARIC. The #
operator is quite general and a modification to it finds use
in PLA testing [12] and reliability computation of
general networks [13,14,19]. Similar to the algorithm of
Ozguner and Aykanat [4], the proposed technique is also
formulated to run on a single processor which would
typically be the host or the resource manager in a
commercial hypercube system.

The layout of the paper is as follows. Section 2
provides a discussion on the hypercube and its properties.
The ATARIC operators, # and $, are presented in Section
3. Section 4 gives the algorithm and illustrates the
technique with examples. The complexity issues
described in Section 5 show that our method is more
efficient than the previous approach [4]. Finally, Section
6 concludes the paper. An appendix provides a proof of
correctness of the algorithm.

2.  Preliminaries 

2.1.  Hypercube  concepts 

An n-dimensional hypercube is defined as B_n =
K_2 x B_(n-1), where K_2 is the complete graph with two
nodes, B_0 is a trivial graph with one node, and x is the
product operation on two graphs [16]. Let B_n be modeled
as a graph G(V,E) with |V| = 2^n and |E| = n2^(n-1). The
graph G(V,E) is both node and link symmetric. Each
node in G(V,E) represents a processor and each edge
represents a link between a pair of processors. Nodes are
assigned binary numbers from 0 to (2^n - 1) such that
addresses of any two neighbors differ in only one bit
position. Using an n-tuple, a PE in B_n is represented by
(b_(n-1), ..., b_i, ..., b_0), where b_i ∈ {0,1}. Two adjacent
nodes which differ in the i-th bit are said to be in
direction i (0 ≤ i ≤ n-1) with respect to each other. A
subcube in a hypercube B_n is a subset of a hypercube
which preserves the properties of a hypercube. It is
represented by an n-tuple in {0,1,x}^n. Coordinate values
"0" and "1" can be referred to as fixed or bound
coordinates and "x" as free. An i-dimensional cube (or
i-subcube) in B_n has (n-i) bound coordinates and i free
coordinates. Hence, there are

\[ \binom{n}{i}\, 2^{n-i} \]

different possible i-subcubes in B_n. We will use the terms node
and 0-subcube interchangeably throughout the paper since
they denote the same object. For links, we introduce a
different notation to differentiate between a link and a
1-subcube. Here, a link has coordinate values 0, 1, and
q. For example, 100q denotes the link with end nodes
1000 and 1001 in Figure 1. An n-tuple describing a link
contains exactly one q. The position of the coordinate q
in one of the n coordinate positions indicates the
adjacency direction for the end nodes. The reader is
referred to [16-18] for other interesting
properties of a hypercube graph.
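
The cube notation above maps directly onto strings over {0, 1, x}, with the
leftmost symbol standing for b_(n-1). The following minimal sketch enumerates
the i-subcubes of B_n and expands a link written with one q into its end
nodes. The function names are assumptions introduced for the illustration and
do not come from the paper.

```python
from itertools import combinations, product
from math import comb

def subcubes(n, i):
    """Yield every i-subcube of B_n as a string over {0, 1, x}."""
    for free in combinations(range(n), i):
        bound = [p for p in range(n) if p not in free]
        # Every assignment of 0/1 to the bound positions gives one subcube.
        for bits in product("01", repeat=n - i):
            cube = ["x"] * n
            for p, b in zip(bound, bits):
                cube[p] = b
            yield "".join(cube)

def link_end_nodes(link):
    """Return the two end nodes of a link written with exactly one 'q'."""
    assert link.count("q") == 1
    return link.replace("q", "0"), link.replace("q", "1")

two_subcubes_of_b4 = set(subcubes(4, 2))
expected_count = comb(4, 2) * 2 ** (4 - 2)
end_nodes = link_end_nodes("100q")
```

For B_4 the enumeration yields as many 2-subcubes as the counting formula
predicts, and the link 100q expands to the end nodes 1000 and 1001 named in
the text.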


2.2.  Fault  models 


The reconfiguration problem for hypercubes is
explored considering the faults located at i-subcubes
and/or links. In Figure 1, when a 2-subcube xx10 is
faulty, we assume that all the four nodes forming the
2-subcube along with their interconnecting links are
unavailable. For i = 1, a 1-subcube consists of a link and
its two end nodes. Thus, a faulty 1-subcube assumes the
entire group consisting of the link and its two end nodes
is unavailable. Node failure is considered as a special
case of i-subcube fault, where i = 0. We assume that a
node and all edges connected to that node are removed
from the graph.

A link failure has the effect of deleting the particular
link from G(V,E). We consider a link failure to be total,
i.e., we do not assume the case where a full duplex link
fails in one direction but functions in the other direction.
This assumption allows the use of an undirected graph as
a network model as opposed to a directed graph [5].

Note that a link and/or node may be faulty due to the
presence of some hardware failure in the system. When
some task is currently being executed on an i-subcube,
the said i-subcube is temporarily unavailable and may be
considered as faulty from the viewpoint of reconfiguring
the multiprocessor to run an additional task.

Let f_1 = 0000 and f_2 = 0100 be two faulty nodes in B_4
shown in Figure 1. The faulty processor f_2 belongs to
the 2-subcubes xx00, x1x0, 0xx0, x10x, 0x0x, 01xx and,
therefore, destroys these subcubes. The total number of
i-subcubes destroyed by a faulty processor is

\[ \binom{n}{i}. \]

Note that the set of i-subcubes destroyed by a number of
faulty PEs may not be disjoint. For example, 0x0x,
xx00, and 0xx0 are also covered by the fault f_1. In what
follows, we describe the coordinate # and $ operations and
discuss an efficient technique to locate the maximum
dimension (fault-free) subcube using these operators.
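
The counts in this example can be checked mechanically. The sketch below
lists the i-subcubes that contain a given node, using the same string
notation; the helper name `destroyed_subcubes` is an assumption made for the
illustration.

```python
from itertools import combinations
from math import comb

def destroyed_subcubes(node, i):
    """Return the set of i-subcubes that contain the given node."""
    n = len(node)
    result = set()
    for free in combinations(range(n), i):
        # Free the chosen positions and keep the node's bits elsewhere.
        result.add("".join("x" if p in free else node[p] for p in range(n)))
    return result

by_f1 = destroyed_subcubes("0000", 2)
by_f2 = destroyed_subcubes("0100", 2)
shared = by_f1 & by_f2
```

Here `by_f2` holds the six 2-subcubes listed for the fault 0100, and `shared`
holds the three subcubes that the fault 0000 destroys as well, which is why
the destroyed sets cannot simply be added up.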

3.  ATARIC  operators 

The # and $ operators, defined below, are used to find
a non-faulty maximum dimension subcube in the
presence of subcube and link failures. To give an
algebraic definition we first define coordinate # and $
operations as given in Table 1. Let c_r and f_s be two
cubes of length n such that

\[ c_r = (a_{n-1}, \ldots, a_i, \ldots, a_0) \quad \text{and} \quad f_s = (b_{n-1}, \ldots, b_i, \ldots, b_0), \]

where a_i ∈ {0,1,x} and b_i ∈ {0,1,x} (b_i ∈ {0,1,q}) where
the fault type is a subcube (link) failure. The # operation
between c_r and f_s is defined by

\[
c_r \,\#\, f_s =
\begin{cases}
c_r, & \text{if } a_i \,\#\, b_i = y \text{ for any } i,\\
\emptyset, & \text{if } a_i \,\#\, b_i = z \text{ for all } i,\\
\bigcup_{i \in P} (a_{n-1}, \ldots, a_{i+1}, \alpha_i, a_{i-1}, \ldots, a_0), & \text{otherwise,}
\end{cases}
\tag{1}
\]

where P = {i | a_i # b_i = α_i = 0 or 1}.
If C is a set of cubes, C = {c_1, ..., c_h}, then define

\[ C \,\#\, f_s = \bigcup_{r=1}^{h} c_r \,\#\, f_s. \]

The sharp (#) operator is introduced by
Miller [15], and its application to PLA testing and
reliability analysis in general networks is described in
[12-14].

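The $ operator is similar to that of Equation (1).
Table 1 illustrates the coordinate $ operator. Let c_r $ f_s =
c_r if a_i $ b_i = y for any i; else, let c_r $ f_s = X ∪ Y ∪ Z,
where for some j, a_j $ b_j = t, and

\[
X =
\begin{cases}
\emptyset, & \text{if } a_j \,\$\, b_j = t \text{ for some } j \text{ and } a_i \,\$\, b_i = z \text{ for all } i \neq j,\\
\bigcup_{i \in P} (a_{n-1}, \ldots, a_{i+1}, \alpha_i, a_{i-1}, \ldots, a_0), & \text{otherwise, where } P = \{\, i \mid a_i \,\$\, b_i = \alpha_i = 0 \text{ or } 1 \,\},
\end{cases}
\]

\[
Y = (a_{n-1}, \ldots, a_{j+1}, 0, a_{j-1}, \ldots, a_0), \quad \text{and} \quad
Z = (a_{n-1}, \ldots, a_{j+1}, 1, a_{j-1}, \ldots, a_0).
\tag{2}
\]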

Note, both # and $ operators are non-commutative (i.e.,
a_i o b_i ≠ b_i o a_i, where "o" may describe # or $
operators). The following properties of the "o" operator
follow immediately from the definition.

1) \( c_r \circ f_s = c_r \), if \( c_r \cap f_s = \emptyset \)

2) \( c_r \circ f_s \subseteq c_r \)

3) \( c_r \circ f_s \neq f_s \circ c_r \)

4) \( (c_r \circ f_s) \circ f_m \neq c_r \circ (f_s \circ f_m) \); non-associative

5) \( (c_r \cup f_s) \circ f_m = (c_r \circ f_m) \cup (f_s \circ f_m) \)

6) \( (c_r \cap f_s) \circ f_m = (c_r \circ f_m) \cap (f_s \circ f_m) \)

From 5) and 6) it is obvious that the operators satisfy the
distributive law over the ∪ and ∩ operations.
Moreover, the following interesting property is also
possessed by these operators.

\[ (c_r \circ f_s) \circ f_m = (c_r \circ f_m) \circ f_s \]

Miller [15] provides interesting reading material for some
of these properties.

Example 1. Let 0100 be a faulty processing element in
B_4. Using c_1 = xxxx (i.e., B_4 is assumed to be
non-faulty initially) we have

c_1 # f_1 =  xxxx
             0100
             ----
             1011 ; using coordinate # operation.

From Equation (1) we get

c_1 # f_1 = {1xxx, x0xx, xx1x, xxx1}.

Example 2. In B_5, let xx10x describe a subcube
unavailability. Assuming c_1 = xxxxx, we get

c_1 # f_1 =  xxxxx
             xx10x
             -----
             zz01z ; using coordinate # operation.

From Equation (1) we get

c_1 # f_1 = {xx0xx, xxx1x}.

Note that c_1 # f_1 = {xx0xx, xxx1x} provides the
information about working or available subcubes.

Example 3. Consider a faulty link 00q (= f_1) in B_3. For
c_1 = xxx, we have

c_1 $ f_1 =  xxx
             00q
             ---
             11t ; using coordinate $ operation.

From Equations (1) and (2) we get

c_1 $ f_1 = {1xx, x1x, xx1, xx0}.

The first two values are obtained for t = z and the next
two values are for t = 1 or 0. Thus, the fault-free
2-subcubes are: 1xx, x1x, xx1, xx0.
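
Examples 1 to 3 can be reproduced with a short sketch of the two operators
acting on cube strings. Table 1 is not reproduced in this text, so the
coordinate table coded below is an assumption: it is the usual sharp table
(x # 0 = 1, x # 1 = 0, equal or free fault coordinates give z, conflicting
bound coordinates give y), chosen because it agrees with the three examples.
A bound coordinate of the cube facing the q of a link is treated here as
leaving the cube unchanged, which is also an assumption. The function names
are illustrative.

```python
def coord_sharp(a, b):
    """Coordinate # operation; returns 'y', 'z', '0' or '1'."""
    if a == "x":
        return "z" if b == "x" else ("1" if b == "0" else "0")
    # a is bound: it agrees with b (or b is free), or it conflicts with b.
    return "z" if b in ("x", a) else "y"

def sharp(c, f):
    """c # f: subcubes of c that avoid the faulty subcube f."""
    coords = [coord_sharp(a, b) for a, b in zip(c, f)]
    if "y" in coords:
        return {c}                       # c and f are disjoint
    return {c[:i] + v + c[i + 1:] for i, v in enumerate(coords) if v in "01"}

def dollar(c, f):
    """c $ f: subcubes of c that avoid the faulty link f (exactly one 'q')."""
    j = f.index("q")
    if c[j] != "x":
        return {c}                       # c holds no link in direction j
    as_subcube = f.replace("q", "x")     # the 1-subcube containing the link
    coords = [coord_sharp(a, b) for a, b in zip(c, as_subcube)]
    if "y" in coords:
        return {c}
    x_part = {c[:i] + v + c[i + 1:] for i, v in enumerate(coords) if v in "01"}
    y_part = c[:j] + "0" + c[j + 1:]     # cube Y of Equation (2)
    z_part = c[:j] + "1" + c[j + 1:]     # cube Z of Equation (2)
    return x_part | {y_part, z_part}

example_1 = sharp("xxxx", "0100")
example_2 = sharp("xxxxx", "xx10x")
example_3 = dollar("xxx", "00q")
```

An empty result from `sharp` corresponds to the second case of Equation (1),
where the fault covers the whole cube.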

4.  Algorithm 

The operators discussed in Section 3 are useful in the
understanding of ATARIC. The steps of the algorithm
are as follows.

Step 0. [Given] Fault list f_1, f_2, ..., f_m. An element f_i
        describes a subcube and/or link failure and uses
        the representation described earlier.

Step 1. [Initialize] C_1 = {c_1}, where c_1 = x x x ... x.
        The cube c_1 is an n-tuple and assumes that B_n
        is non-faulty (available) initially. Set i = 1.

Step 2. Compute C_(i+1) = C_i o f_i,
        where the operation "o" is either # or $
        depending on f_i representing a subcube or link
        failure, respectively. Note that C_(i+1) may have
        one or more than one subcubes.
        If C_(i+1) = ∅, go to step 4.

Step 3. Increment i = i+1.
        Check i > m:
        a) if yes, go to step 4.
        b) if no, go to step 2.

Step 4. Stop.

Illustrating Example. Let the faulty elements in a
6-hypercube B_6 be

f_1 = 011000    f_2 = 000101    f_3 = 001q10
f_4 = 100010    f_5 = 110111    f_6 = 101101

Note f_1, f_2, f_4, f_5, and f_6 describe faulty nodes while
f_3 describes a faulty link. Initializing c_1 to xxxxxx
and following the steps of the algorithm, we have

C_1 # f_1 = {1xxxxx, x0xxxx, xx0xxx, xxx1xx, xxxx1x, xxxxx1}
C_2 # f_2 = {1xxxxx, xxxx1x, ...}
C_3 $ f_3 = {1xxxxx, ...}
   ...
C_6 # f_6 = {xxx0x1, xxx1x0, ...}.

Thus, two fault-free 4-subcubes exist that can be
used to reconfigure B_6. In this example, most of the
details are suppressed to maintain the readability of the
paper. We shall give the details shortly.
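
Steps 1 to 4 can be sketched in a few lines for cubes held as strings. The
code is an illustrative reading of the algorithm rather than the authors'
implementation: `remove_fault` folds the # and $ cases into one helper, and
the coordinate rules it uses are the same assumptions about Table 1 that
reproduce Examples 1 to 3. All function names are hypothetical.

```python
def remove_fault(c, f):
    """Apply # (subcube fault) or $ (link fault, one 'q') to a single cube c."""
    j = f.find("q")
    if j >= 0 and c[j] != "x":
        return {c}                                # no direction-j link inside c
    g = f.replace("q", "x")
    if any(a != "x" and b != "x" and a != b for a, b in zip(c, g)):
        return {c}                                # c and the fault are disjoint
    flip = {"0": "1", "1": "0"}
    pieces = {c[:i] + flip[b] + c[i + 1:]
              for i, (a, b) in enumerate(zip(c, g)) if a == "x" and b != "x"}
    if j >= 0:                                    # link fault: add cubes Y and Z
        pieces |= {c[:j] + "0" + c[j + 1:], c[:j] + "1" + c[j + 1:]}
    return pieces

def ataric(n, faults):
    """Steps 1-4: return the final set of fault-free cubes."""
    cubes = {"x" * n}                             # Step 1
    for f in faults:                              # Steps 2 and 3
        cubes = set().union(*(remove_fault(c, f) for c in cubes))
        if not cubes:                             # empty set: go to Step 4
            break
    return cubes

def largest(cubes):
    """Keep only the cubes with the most free coordinates."""
    if not cubes:
        return set()
    best = max(c.count("x") for c in cubes)
    return {c for c in cubes if c.count("x") == best}

faults = ["011000", "000101", "001q10", "100010", "110111", "101101"]
after_first_fault = ataric(6, faults[:1])
best_subcubes = largest(ataric(6, faults))
```

On the six faults of the illustrating example, `best_subcubes` contains the
two fault-free 4-subcubes xxx0x1 and xxx1x0. The sketch keeps every generated
cube, so the final set also carries lower-dimensional cubes that `largest`
filters out.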

Theorem. For any n-dimensional hypercube in which we
are given a list of faulty subcubes and/or links, the
non-faulty subcube can be identified using the algorithm.

Proof: Refer to the appendix.

The algorithm described here is good for computer
simulation. An algebraic technique formulated using
these concepts is presented next. To help understand the
technique we need the following representations and
definitions.

P1. We represent a fault f_i using the notation
    \( \{a, \bar{a}, \hat{a}\} \), where an uncomplemented
    (complemented) variable denotes 1 (0). An \( \hat{a} \)
    denotes the "q" of a link failure. An absent variable in a
    position represents the "x". Therefore, a faulty node 0110
    is represented by \( \bar{a}\,b\,c\,\bar{d} \), a subcube
    fault 11xx by \( a\,b \), and a link fault 001q10 by
    \( \bar{a}\,\bar{b}\,c\,\hat{d}\,e\,\bar{f} \).

P2. A D-operator operates on f_i. It modifies f_i using the
    following transformations:

    \[
    \begin{array}{rcl}
    \text{AND}\ (\cdot) & \rightarrow & \text{OR}\ (+)\\
    a & \rightarrow & \bar{a}\\
    \bar{a} & \rightarrow & a\\
    \hat{a} & \rightarrow & (a + \bar{a})
    \end{array}
    \]

    Thus, \( D(\bar{a}\,b\,c\,\bar{d}) = (a + \bar{b} + \bar{c} + d) \) and
    \( D(\bar{a}\,\bar{b}\,c\,\hat{d}\,e\,\bar{f}) = (a + b + \bar{c} + d + \bar{d} + \bar{e} + f) \).
    Note, we have used · (+) and ∩ (∪) interchangeably. Also, the
    Boolean identity of the type \( a + \bar{a} = 1 \) does not hold
    in this case. However, the identity \( a\,\bar{a} = 0 \) is still
    applicable.

P3. The "o" operator ("o" stands for # or $) is, then,
    described as:

    \[ C_2 = D(f_1) \quad \text{and} \quad C_{i+1} = C_i\, D(f_i). \tag{4} \]

Here, juxtaposition of C_i with D(f_i) denotes the
Boolean AND operation and C_i D(f_i) can be expanded
using Boolean rules [15]. Equation (4) has the same
effect as Equation (3).

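Using P1 through P3 above, the illustrating example
is solved as follows.

\[
\begin{array}{lll}
f_i & \text{Binary representation} & \text{Variable representation}\\
f_1 & 0\;1\;1\;0\;0\;0 & \bar{b}_5\, b_4\, b_3\, \bar{b}_2\, \bar{b}_1\, \bar{b}_0\\
f_2 & 0\;0\;0\;1\;0\;1 & \bar{b}_5\, \bar{b}_4\, \bar{b}_3\, b_2\, \bar{b}_1\, b_0\\
f_3 & 0\;0\;1\;q\;1\;0 & \bar{b}_5\, \bar{b}_4\, b_3\, \hat{b}_2\, b_1\, \bar{b}_0\\
f_4 & 1\;0\;0\;0\;1\;0 & b_5\, \bar{b}_4\, \bar{b}_3\, \bar{b}_2\, b_1\, \bar{b}_0\\
f_5 & 1\;1\;0\;1\;1\;1 & b_5\, b_4\, \bar{b}_3\, b_2\, b_1\, b_0\\
f_6 & 1\;0\;1\;1\;0\;1 & b_5\, \bar{b}_4\, b_3\, b_2\, \bar{b}_1\, b_0
\end{array}
\]

(i) \[ C_2 = D(f_1) = (b_5 + \bar{b}_4 + \bar{b}_3 + b_2 + b_1 + b_0) \]

(ii) \[
\begin{aligned}
C_3 = C_2 D(f_2) &= (b_5 + \bar{b}_4 + \bar{b}_3 + b_2 + b_1 + b_0)(b_5 + b_4 + b_3 + \bar{b}_2 + b_1 + \bar{b}_0)\\
&= b_5 + b_1 + \bar{b}_4 b_3 + \bar{b}_4 \bar{b}_2 + \bar{b}_4 \bar{b}_0 + b_4 \bar{b}_3 + \bar{b}_3 \bar{b}_2 + \bar{b}_3 \bar{b}_0\\
&\quad + b_4 b_2 + b_3 b_2 + b_2 \bar{b}_0 + b_4 b_0 + b_3 b_0 + \bar{b}_2 b_0
\end{aligned}
\]

(iii) \[
\begin{aligned}
C_4 = C_3 D(f_3) &= C_3\,(b_5 + b_4 + \bar{b}_3 + b_2 + \bar{b}_2 + \bar{b}_1 + b_0)\\
&= b_5 + b_4 b_1 + \bar{b}_3 b_1 + b_2 b_1 + \bar{b}_2 b_1 + b_1 b_0 + \bar{b}_4 \bar{b}_2 + b_4 \bar{b}_3\\
&\quad + \bar{b}_3 \bar{b}_2 + \bar{b}_3 \bar{b}_0 + b_4 b_2 + b_3 b_2 + b_2 \bar{b}_0 + b_4 b_0 + b_3 b_0 + \bar{b}_2 b_0\\
&\quad + \text{higher order terms}
\end{aligned}
\]

(iv) \[
\begin{aligned}
C_5 = C_4 D(f_4) &= b_5 b_4 + b_5 b_3 + b_5 b_2 + b_5 \bar{b}_1 + b_5 b_0 + b_4 b_1 + b_2 b_1 + b_1 b_0\\
&\quad + b_4 \bar{b}_3 + b_4 b_2 + b_3 b_2 + b_2 \bar{b}_0 + b_4 b_0 + b_3 b_0 + \bar{b}_2 b_0\\
&\quad + \text{higher order terms}
\end{aligned}
\]

(v) \[
C_6 = C_5 D(f_5) = b_5 b_3 + b_5 \bar{b}_1 + b_3 b_2 + b_2 \bar{b}_0 + b_3 b_0 + \bar{b}_2 b_0 + \text{higher order terms}
\]

(vi) \[
C_7 = C_6 D(f_6) = \bar{b}_2 b_0 + b_2 \bar{b}_0 + \text{higher order terms}
\]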

The cubes \( \bar{b}_2 b_0 \) and \( b_2 \bar{b}_0 \) in binary representation
give the results as {xxx0x1, xxx1x0}. While computing
the fault-free 4-subcubes we have suppressed the
information regarding higher order terms. These are
relevant only when we fail to compute a non-faulty
4-subcube. The notion may be extended to enumerate
fault-free d-subcubes. Here, we should initially use
intermediate terms having cardinality j, 1 ≤ j ≤ d. Note
that terms with cardinality j correspond to n-tuples with j
bound coordinates and n-j free coordinates. Terms with
cardinality (d+1) or higher are kept with a different
group/bin. This bin is, obviously, useful when we are
unable to generate a d-subcube. In this way, we are using
a pruned-tree approach to contain the size of intermediate
terms generated by the algorithm.
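
The algebraic form with pruning can be sketched as follows. Each product term
is held as a set of literals (variable index, value), a term meaning the cube
whose bound coordinates are exactly those literals. This is a minimal sketch
under stated assumptions: the names are hypothetical, and terms above
`max_order` are simply discarded here instead of being kept in a separate bin
as the text describes.

```python
def d_operator(fault):
    """D(f): complement every literal of f; a 'q' yields both literals."""
    n = len(fault)
    literals = set()
    for pos, bit in enumerate(fault):
        var = n - 1 - pos                       # leftmost symbol is b_(n-1)
        if bit == "0":
            literals.add((var, 1))
        elif bit == "1":
            literals.add((var, 0))
        elif bit == "q":
            literals.update({(var, 0), (var, 1)})
    return literals                             # 'x' positions contribute nothing

def multiply(terms, literals, max_order):
    """Expand terms * (l1 + l2 + ...), dropping products above max_order."""
    out = set()
    for term in terms:
        assigned = dict(term)
        for var, val in literals:
            if var in assigned and assigned[var] != val:
                continue                        # a * a-bar = 0
            new = term | {(var, val)}
            if len(new) <= max_order:
                out.add(new)
    # Absorption: a + a*b = a.
    return {t for t in out if not any(u < t for u in out)}

def fault_free_terms(faults, max_order):
    terms = {frozenset()}                       # the constant 1, i.e. x x ... x
    for f in faults:
        terms = multiply(terms, d_operator(f), max_order)
    return terms

def to_cube(term, n):
    cube = ["x"] * n
    for var, val in term:
        cube[n - 1 - var] = str(val)
    return "".join(cube)

faults = ["011000", "000101", "001q10", "100010", "110111", "101101"]
second_order = {to_cube(t, 6) for t in fault_free_terms(faults, 2)}
```

With `max_order = 2` only terms with at most two bound coordinates survive,
which for B_6 are the candidate subcubes of dimension 4 or more; the result
for the illustrating example is the pair xxx0x1 and xxx1x0.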

5.  Complexity  analysis 

We now analyse the time complexity of the
algorithm presented in Section 4. We assume that
computing c_r # f_s for one cube c_r, and c_r $ f_s for one
cube c_r, can be done in one time step for each resulting
cube. We assume that each fault is a 0-subcube (that is, a
node) as this will produce the worst case bounds. For a
given list of m faults, the algorithm computes
C_(i+1) = C_i o f_i on each of m iterations. Each C_i may be
a set of cubes, so our object is to bound the number of
cubes in C_i. Initially, C_1 = {x x x ... x}. By definition of
the # and $ operators, in the worst case, C_2 may contain
n cubes: x x ... x α_0, x x ... α_1 x, ..., α_(n-1) x ... x,
where α_i ∈ {0, 1}. In the worst case, a set of cubes
produced by c_r # f_s may contain at most as many cubes
as there are x's in c_r, and each of those cubes will contain
one less x than c_r. Hence, C_(i+1) may contain at most

\[ n(n-1)\cdots(n-i+1) = \frac{n!}{(n-i)!} \]

cubes, where n is the dimension of the hypercube under
consideration. Actually, with n symbol positions and 3
possible symbols, {0, 1, x}, there are at most 3^n possible
cubes. Let v be the least value of i such that
n!/(n-i)! > 3^n. Note that n(n-1)...(n-i+1) < n^i. So the
time to compute m iterations of the algorithm, for m ≤ v, is

\[ \sum_{i=1}^{m} \frac{n!}{(n-i)!} < O(n^m). \]

And the time to compute m iterations of the algorithm,
for m > v, is

\[
\sum_{i=1}^{v} \frac{n!}{(n-i)!} + \sum_{i=v+1}^{m} 3^n
\le \sum_{i=1}^{v} n^i + \sum_{i=v+1}^{m} 3^n
= O(n^v) + (m-v)\,2^{O(n)}
< m\,2^{O(n)}.
\]

In terms of N, the size of the hypercube, the time
complexity for number of faults m ≤ v is O((log N)^m) =
O(N), and the time complexity for m > v is O(mN).
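
To make the threshold v concrete, the two bounds can be compared numerically.
The sketch below uses `math.perm` for n!/(n-i)!; the function names are
assumptions made for the illustration, and the code only evaluates the bounds
stated above.

```python
from math import perm

def worst_case_cubes(n, i):
    """Upper bound on the number of cubes after i node faults, capped at 3**n."""
    return min(perm(n, i), 3 ** n)

def threshold_v(n):
    """Least i with n!/(n-i)! > 3**n, or None if no such i <= n exists."""
    for i in range(1, n + 1):
        if perm(n, i) > 3 ** n:
            return i
    return None

v_for_7 = threshold_v(7)
v_for_10 = threshold_v(10)
v_for_6 = threshold_v(6)
```

For n = 7 the falling factorial first exceeds 3^7 = 2187 at i = 5, and for
n = 10 it first exceeds 3^10 at i = 6. For n = 6 even 6! = 720 stays below
3^6 = 729, so the cap of 3^n is never the binding bound in that case.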

The above time complexity describes the time to
generate a list of all fault-free subcubes, given m failed
nodes. If we wish instead to compute a list of all
fault-free subcubes of dimension at least n-k, then we can
obtain a better time complexity. In this instance, the
cubes with k or fewer 0 or 1 symbols in their
representations (that is, k or fewer bound coordinates)
correspond to cubes of dimension n-k or higher. If m ≤
k, then we again obtain a time complexity of O(n^m).
But if m > k, then we obtain the following time
complexity.

\[
\sum_{i=1}^{k} \frac{n!}{(n-i)!} + \sum_{i=k+1}^{m} \frac{n!}{(n-k)!}
\le \sum_{i=1}^{k} n^i + \sum_{i=k+1}^{m} n^k
= O(n^k) + O((m-k)\,n^k)
= O((m-k)\,n^k).
\]

This time improves on Ozguner and Aykanat's algorithm [4], which requires


0\  mk\ 
K 


time to locate the available subcubes of dimension n-k or greater.

The ATARIC procedure will have a significantly
better expected time complexity, however, for a random
set of faults. Let us call a cube with k bound coordinates
and n-k free coordinates a type k cube. Each cube in
set C_i can be of type j, for 1 ≤ j ≤ i-1. The analysis
above assumed that each type k cube would in a single
iteration fragment into n-k type k+1 cubes. By the
definition of the # and $ operators, a type k cube will
either fragment into n-k type k+1 cubes or remain as a
single type k cube or disappear. A type k cube will only
fragment if each of the k 0's or 1's in the c_r cube matches
exactly with k identical 0 or 1 bits in the faulty element
f_s; otherwise, the c_r cube remains as a single cube or
disappears. For a random fault f_s, the probability is 2^(-k)
that the cube c_r will fragment and 1 - 2^(-k) that it will
remain as the same single cube or disappear. Therefore,
the number of cubes generated is much, much less than
O(n^m).

6.  Conclusion 


This paper has described two operators, namely # and
$. The sharp (#) operator is used extensively for
generating test sets for logical faults in PLAs and also for
reliability evaluation in general networks. The dollar ($)
operator is introduced in this paper for the first time.
These two operators are the main features of ATARIC.
The proposed technique is straightforward and efficient
compared to previous algorithms [4,8]. We plan to
extend the concept to the subcube allocation and task
migration problems in hypercube multiprocessors.
Appendix

Proof  of  correctness 

The following three steps are useful to verify the
correctness of ATARIC:

Step 1. If we form c_i # f_j, where c_i and f_j are cubes, then
the sharp (#) operator produces the set of subcubes of c_i
which do not intersect with f_j. If a y is obtained in any
coordinate position, then that coordinate is bound to 0
(1) in c_i and 1 (0) in f_j, so the cubes c_i and f_j are
disjoint, and the sharp operator produces c_i.

Step 2. The removal of a link in dimension i from a
cube produces the following subcubes: the set of
subcubes produced on the removal of the 1-subcube
containing the link (that is, the link and its adjacent
nodes) and the two subcubes produced by partitioning
the original cube along dimension i. These last two
subcubes have the same description as the original
cube, except for a 0 or 1 in position i. If we form c_i $
f_j, where c_i is a cube and f_j is a link, then the $
operation is very similar to the # operation. The
difference lies in the handling of the variable t that may
occur in the same coordinate position as the q in f_j.
The case in which t is set to z produces the subcubes
disjoint from the 1-subcube containing link f_j; the
cases in which t is set to 0 and 1 produce the subcubes
that contain the endpoints of the link.

Step 3. The result of the operation in Equation (3) is a
set of cubes from C_i that are not covered by the faulty
element f_i. The iterative application gives a cover of the
subcubes, if any, not covered by the fault list.


Table 1. Coordinate # and $ operations

              f_j
  c_i     0    1    x
   0      z    y    z
   1      y    z    z
   x      1    0    z

(a) # operation

              f_j
  c_i     1    q
   0      y    y
   1      z    y
   x      0    t

(b) $ operation
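The coordinate rules of Table 1(a) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' ATARIC implementation: the function name `sharp` and the string encoding of cubes over the symbols `0`, `1`, `x` are assumptions made for the example. A `y` in any coordinate means the cubes are disjoint, so the cube is returned unchanged; an all-`z` result means the fault covers the cube, so nothing survives; otherwise each coordinate that yields 0 or 1 contributes one subcube.

```python
def sharp(cube: str, fault: str) -> list[str]:
    """Sketch of c # f for cubes written over {'0', '1', 'x'} (hypothetical helper)."""
    assert len(cube) == len(fault)
    pieces = []
    for i, (c, f) in enumerate(zip(cube, fault)):
        if c != "x" and f != "x" and c != f:
            # Table entry 'y': the cubes are disjoint, so c survives intact.
            return [cube]
        if c == "x" and f != "x":
            # Table entries x#0 = 1 and x#1 = 0: bind the free coordinate
            # to the complement of the fault's bit.
            flipped = "1" if f == "0" else "0"
            pieces.append(cube[:i] + flipped + cube[i + 1:])
    # Every coordinate gave 'z' when no piece was produced: c is covered by f.
    return pieces
```

For a 3-cube with the single faulty node 000, `sharp("xxx", "000")` yields the three 2-subcubes that avoid the fault. The subcubes returned may overlap one another; the sketch makes no attempt to make them disjoint.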


IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS

APRIL 1991 VOLUME 2 NUMBER 2 (ISSN 1045-9219)

A PUBLICATION OF THE IEEE COMPUTER SOCIETY


PAPERS

Hypercube Computers
Clustering on a Hypercube Multicomputer . . . S. Ranka and S. Sahni 129

Parallel Memories
Compile-Time Techniques for Improving Scalar Access Performance in Parallel Memories . . . R. Gupta and M. L. Soffa 138

Dictionary Machine
A Generalized Simultaneous Access Dictionary Machine . . . Z. Fan and K.-H. Cheng 149

Distributed Databases
A Class of Randomized Strategies for Low-Cost Comparison of File Copies . . . D. Barbará and R. J. Lipton 160
A Nonblocking Quorum Consensus Protocol for Replicated Data . . . D. Agrawal and A. J. Bernstein 171

Performance Evaluation
The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems . . . J. Zahorjan, E. D. Lazowska, and D. L. Eager 180

Reliability and Fault Tolerance
Computer Aided Reliability Evaluator for Distributed Computing Networks . . . S. Soh and S. Rai 199
Consensus with Dual Failure Modes . . . F. J. Meyer and D. K. Pradhan 214

Dataflow
Consistency in Dataflow Graphs . . . E. A. Lee 223

Parallel Algorithms
Uniform Approach for Solving Some Classical Problems on a Linear Array . . . D. R. O'Hallaron 236
Parallel Implementation of Multiple Model Tracking Algorithms . . . A. Averbuch, S. Itzikowitz, and T. Kapon 242

SHORT NOTES

Performance of Shared Memory in a Parallel Computer . . . K. Donovan 253


IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 2, NO. 2, APRIL 1991


CAREL:  Computer  Aided  Reliability 
Evaluator  for  Distributed  Computing  Networks 
Abstract — This paper presents an efficient method to compute the
terminal reliability (the probability of communication between a pair
of nodes) of a distributed computing system (DCS). We assume that
the graph model G(V, E) for DCS is given. Also, it is assumed that
we have path and/or cut information for the network G(V, E). Boolean
algebraic concepts are used to define four operators, namely, COMpare,
REDuce, CoMBine, and GENerate. The proposed method, henceforth
called CAREL (Computer Aided RELiability evaluator), uses these four
operators to generate exclusive and mutually disjoint (e.m.d.) events,
and hence the terminal reliability expression. Examples illustrating the
technique are given in the text. We have implemented CAREL using bit
vector representation [11] on an Encore MULTIMAX 320 system. CAREL
solves large DCS networks (having a pathset of the order of 780 and a cutset
of the order of 7300 or more) with reasonable memory requirements. A
comparison with existing algorithms reveals the computational efficiency
of the proposed method. The proof of correctness of CAREL is included
in the Appendix.

Index Terms — Bit vector representation, Boolean technique, CAREL,
combinational and sequential reliability, distributed system reliability,
minimal conditional cube, minpath, mincut, operators (COM, RED,
CMB, and GEN), reliability evaluation tool.


I. INTRODUCTION

ADVANCES in computer technology and the need to have
the computers communicating with each other have led
to an increased demand for a reliable distributed computing
system (DCS). An important performance metric in the design
of a highly reliable DCS network is provided by its terminal
reliability parameter [1]. Note, the terminal reliability refers to the
probability that at least one path exists between a prescribed node
pair in the distributed system [2], [6]. The terminal reliability
computation problem is known to be NP-hard [2], [9], [10],
[23], [26], [27].

Several algorithms dealing with terminal reliability evaluation
have been proposed in the literature [1]-[4], [7], [12]-[19],
[21], [24], [25]. These methods fall into one of the following
categories: state enumeration, decomposition technique,
inclusion-exclusion, factoring, and sum of disjoint products. A
summary of these techniques, including their relative merits and
demerits, can be found in [2].

Various techniques [1], [3], [12]-[16], [18], [19] have utilized
Boolean concepts to obtain a sum of disjoint products, and hence
the terminal reliability parameter of a given DCS network. All
of these methods start with a Boolean polynomial formed by
either the success terms (minimal paths) or the failure terms
(minimal cuts) for a given DCS. The paths or cuts are sequenced
in order of their increasing cardinality. For each group of
terms of the same size, the ordering is lexicographic, following
the order of the symbols of the alphabet. The ordering of
terms helps reduce the overall time complexity for generating the
sum of disjoint products (SDP) expression. A method [1], [3],
[12]-[16], [18], [19] then converts the Boolean polynomial of
paths or cuts into an equivalent Boolean SDP form that represents
the disjoint system logic. Note, an SDP expression has a one-to-one
correspondence with the system probability formula. A drawback
of the algorithms based on the manipulation of Boolean sums of
products or implicants is the iterative application of certain
operations and the fact that the Boolean function changes at
every step and may be clumsy. Moreover, the Boolean function
is simplified using absorption rules [20] and thus requires a
considerable computational effort [3]. Therefore, most sum of
disjoint products algorithms are applicable only to small to
moderate sized networks. Recently, Veeraraghavan and Trivedi
[18] have reported an algorithm modifying the concepts given in
[3]. Their method solves a large DCS network (Fig. 11 in [18] is
the same as our Fig. 19 but for a few directional links and, thus, has
only 425 paths) in 166 668 s (= 46.3 CPU hours). Obviously, we
still need an efficient algorithm (based on the application of Boolean
algebra) to solve the terminal reliability problem in large distributed
processing systems. CAREL (Computer Aided RELiability
evaluator) provides a solution in this direction. CAREL computes
the terminal reliability parameter of the DCS network of Fig. 19
in less than a minute of CPU time.

CAREL uses Boolean algebraic concepts and the COMpare,
REDuce, CoMBine, and GENerate operators. Refer to
the text for a discussion of these operators. The algorithm is
efficiently implemented, as the CPU time obtained for generating
the terminal reliability parameter for some moderate to large sized
networks is considerably less than that reported
for other algorithms [1], [3], [18]. SYREL [1] provides an
efficient implementation scheme for the E-operator technique [12]
using set theoretic concepts, minimal conditional sets (MCS),
and partitioning MCS's into independent and dependent groups
to reduce the amount of computation in generating the disjoint sum
of products expression. The $ (sharp) and @ operators [3], [18]
reduce the total number of disjoint products by grouping variables
together such that approximately 30-40% saving in the final
reliability expression is achieved. CAREL utilizes the advantages
of both SYREL [1] and [3] and [18] to obtain low computation
time. Moreover, CAREL operators are bit implementable (refer
to text).

The layout of the paper is as follows. Section II provides
a generalized view of various existing Boolean techniques. It
outlines and compares their basic philosophies. The notion of
data representation is also considered. Section III introduces the
notation used and defines four operators, namely COM, RED,
CMB, and GEN. The algorithm, its implementation details, and
various illustrating examples are described in Section IV. Section
V provides comparison tables showing the computer time for
evaluating moderate to large sized DCS networks. It also presents
a comparison of CAREL with existing techniques. We conclude
the paper in Section VI. The Appendix shows the proof for the
correctness of the algorithm.

II. PRELIMINARIES

Consider a linear graph G(V, E) model for the DCS network
such that nodes V (edges E) represent computers (communication
links). Assume G(V, E) is free from self loops and directed
cycles. Each edge has two states: good (UP) or bad (DOWN).
Nodes are perfect (imperfect nodes can be considered following a
method given in [21]). Let the failures be statistically independent
(statistically dependent failures can be solved following a method
given in [5]). This assumption is useful to make the problem
mathematically tractable. A minpath P_i is a path from a source
node s to a terminal node t in G(V, E). It is formed by the set
of UP edges such that no nodes are traversed more than once.
Pathset is defined as the set of minpaths. A cut is a disconnecting
set. All communication between a prescribed (s, t) node pair is
disrupted once the edges in an (s, t) cut fail. Define a mincut as a
cut which has no proper subset that is also a cut, and cutset as
the set of mincuts. Assume that either the pathset or the cutset between
a source s and terminal t in G(V, E) is known.

A. Data Representation

Data structure is an important aspect of designing efficient
algorithms [8]. Rosenthal [11] discussed the advantages and
disadvantages of three different kinds of data representation for
cutset enumeration. This section briefly describes one of the
representations, namely the bit vector representation, because it
is suitable for the implementation of the proposed method. A
minpath (mincut) in a network with l links is represented by l
bits. An UP link of the network is denoted by a binary 1. A binary
0 stands for a don't care state (not a DOWN state). Consider
the minpaths ab, cd, ade, and bce between the (s, t) node pair in
Fig. 1. These minpaths are stored in memory as

ab  : 0000000000000011    cd  : 0000000000001100
ade : 0000000000011001    bce : 0000000000010110.

In this example, we have utilized the word size w as 16 bits.
Note, a minpath (mincut) requires ⌈l/w⌉ words of memory. With
bit vector representation, the storage requirement for a minpath
(mincut) depends on the total number of links in the network
and not on the size of the pathset (cutset). Coding and decoding
of path information into bit representation and vice versa may
add extra cost as it involves l bit testings. However, this pre-
and postprocessing of minpaths (mincuts) is a one-time operation.
It is usually worth the extra computation as the generation
of disjoint events requires considerable manipulations. Moreover,
the ability of bit representation in detecting and eliminating
redundant terms using set theoretic operations like union,
intersection, subset, etc., is an important advantage. To illustrate
the concept for redundancy checking, assume the reference term
X, and a test term XY (which is a redundant subset of X):


reference (X)     1 1 0 0 1
test (XY)         1 1 1 0 1     ; OR operation
                  1 1 1 0 1
test (XY)         1 1 1 0 1     ; EOR operation
                  0 0 0 0 0

A result "0 0 0 0 0" shows XY redundant. A duplicate term is
detected using the same approach. The set theoretic operations are
the size of the network. The number of words, which affects
the speed of the set operations, increases the computation time by
one unit for every w additional links. The proposed method uses
these operations with the COMpare and REDuce operators (refer
to the text) and we have discussed it in detail in Section IV-B
while considering the implementation of various set theoretic
operations.
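The bit vector representation and the OR/EOR redundancy check described in this section can be sketched as follows. This is an illustrative Python fragment, not the CAREL code: the helper names (`encode`, `is_redundant`, `words_needed`) and the link order `a` through `e` (link a in the least significant bit) are assumptions chosen to match the stored patterns shown for Fig. 1.

```python
LINKS = "abcde"  # assumed link order: bit 0 = a, bit 1 = b, ...

def encode(path: str) -> int:
    """Pack a minpath such as 'ade' into a bit vector (1 = UP, 0 = don't care)."""
    bits = 0
    for link in path:
        bits |= 1 << LINKS.index(link)
    return bits

def is_redundant(reference: int, test: int) -> bool:
    """OR the two terms, then EOR with the test term; all zeros means redundant."""
    return ((reference | test) ^ test) == 0

def words_needed(num_links: int, word_size: int = 16) -> int:
    """Ceiling of l / w: memory words per minpath or mincut."""
    return -(-num_links // word_size)
```

With this encoding, `encode("ade")` gives the pattern 11001 and `is_redundant(0b11001, 0b11101)` reproduces the X versus XY check above. The check costs one OR and one EOR per machine word, which is why the word count, rather than the number of paths, drives the cost of each comparison.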

B.  Boolean  Techniques  Concept 

Various Boolean techniques of reliability evaluation start with
a sum of products expression for minpaths or cutsets and convert
it into an equivalent sum of disjoint products expression. In the
SDP form, an UP or logical success (DOWN or failure) state of
a link x is replaced by link reliability p (unreliability q), and the
Boolean sum (product) by the arithmetic sum (product). In other
words, the SDP expression is interpreted directly as an equivalent
probability expression of terminal reliability. If F_i represents a
path identifier (an UP state of a link in a path P_i has 1 in F_i,
while a don't care is represented by 0), the sum of products
expression F is given by

$$F = \bigcup_{i=1}^{n} F_i \qquad (1)$$

where n denotes the number of minpaths (cutsets) between the (s, t)
node pair in G(V, E). Equation (1) is modified either canonically
or conservatively to generate the equivalent SDP expression,
F(disjoint). The conservative modification is usually preferred,
since it is more efficient compared with canonical modification,
where 2^l events are required to determine F(disjoint). (l is the
number of links in the network.) A simple way to generate the
mutually disjoint events in (1) is as follows:

$$F_1 + F_2\bar{F}_1 + F_3\bar{F}_1\bar{F}_2 + \cdots + F_n\bar{F}_1\bar{F}_2\cdots\bar{F}_{n-1}$$

where \bar{F}_i denotes DOWN events of F_i. The probability of UP
(operational) for an ith term F_i \bar{F}_1 \bar{F}_2 ... \bar{F}_{i-1}
can be evaluated using conditional probability and standard Boolean
operations as

$$\Pr(F_i)\cdot\Pr(\bar{F}_1\bar{F}_2\cdots\bar{F}_{i-1}\mid F_i) = \Pr(F_i)\,\Pr\Big(\bigcap_{j=1}^{i-1} E_j\Big)$$

Here, an E_j represents a conditional cube [20] and defines
conditions for a path identifier F_j DOWN given F_i as UP
(operational). The probability of the first event Pr(F_i) can be
determined in a straightforward manner, since the failures are
assumed to be statistically independent. However, the coefficient
Pr(E_j) requires further consideration since various terms within
E_j's will, in general, be not disjoint [2]. This necessitates E_j's
to be made mutually disjoint before we generate the equivalent
probability expression. Note, F_i's in (1) are sequenced in the
order of their increasing cardinality, and also for each group of
terms of the same size, the ordering is lexicographic. Therefore,
the disjoint products for any (m-1) size path identifier F_i, where
m denotes the number of nodes in G(V, E), are obtained directly
by intersecting the complements of the remaining l-(m-1) links
of G(V, E) with F_i. This observation (first made in [12], and
then proved in [1]) reduces computational time for algorithms
based on Boolean concepts.

Various researchers [2] have given techniques which generate a
disjoint expression for (F_i, F_j) pairs in (1), and also the (E_j, E_k)
terms within an F_i. The following three propositions (P I through
P III) that convert F into F(disjoint) represent basic philosophies
for most Boolean methods in the reliability literature.

Fig. 1. Bridge network.


P I: The proposition P I defines intermediate term(s) T_i's as

$$T_i = \bigcup_{j=1}^{i-1} F^{j}\,\Big|_{\text{each literal of } F_i \to 1} \qquad (2)$$

where F^1 = F_1 and F^i = F_i OP_1 T_i. Here, F^i refers to
the equivalent disjoint product term(s) for F_i. The operation
"OP_1" is a necessary disjointing operator. (Table I lists various
operators.) The F(disjoint) expression is, then, given by

$$F(\text{disjoint}) = \bigcup_{i} F^{i} \qquad (3)$$

Algorithms [16] and [19] make use of proposition P I.

P II: For each term F_i, 1 < i ≤ n, T_i is defined to be the
union of all predecessor terms F_1, F_2, ..., F_{i-1}, in which any
literal that is present in both F_i and any of the predecessor terms
is deleted from those predecessor terms, i.e.,

$$T_i = \bigcup_{j=1}^{i-1} F_j\,\Big|_{\text{each literal of } F_i \to 1} \qquad (4)$$

Consider F^1 = F_1, and define F^i = F_i OP_2 T_i. Equation (3),
then, obtains the equivalent F(disjoint) expression. Hariri and
Raghavendra [1], Rai and Aggarwal [12], Fratta and Montanari
[14], and Bennetts [10] have based their methods on proposition
P II. Refer to Table I for the OP_2 operator.

P III: For 1 < i ≤ n, use operation "OP_3" to perform

$$F^{i} = (\ldots((F_i \;\mathrm{OP}_3\; F_1)\;\mathrm{OP}_3\; F_2)\;\mathrm{OP}_3 \ldots)\;\mathrm{OP}_3\; F_{i-1} \qquad (5)$$

Equation (5) obtains a set of disjoint cubes corresponding to
F_i. Note, F^1 = F_1, and OP_3 represents an appropriate
disjointing operator. The F(disjoint) expression is, then, given by
(3). Tiwari and Verma [22], Grnarov et al. [3], Abraham [13], and
Veeraraghavan and Trivedi [18] have proposed their techniques
using the P III concept. References [3], [13], [15], and [22] consider
F_i's in cubical notation [20]. Table I lists operation OP_3 as
suggested by various researchers in the literature [3], [13], [15],
[18], [22].

Example: To illustrate propositions P I through P III, consider
a DCS network shown in Fig. 1. For the (s, t) node pair, there
exist four minpaths: ab, cd, ade, bce. In what follows, we explain
steps to generate the exclusive and mutually disjoint event(s)
or cube(s) for F_3 (= ade). This is demonstrated using typical
methods for propositions P I through P III. For uniformity, we
keep the notation of [12] with a modification that the state of a
DOWN link y is denoted as (1 - y).

P I: Considering [16], F^1 = ab, and F^2 = (1 - ab)cd. Use
(2) to define T_3 as (b + (1 - b)c) for F_3. Replacing OP_1 with
X-operator [16], X(T_3) is (1 - b)(1 - c). F^3 is then obtained
as F^3 = F_3 X(T_3). The term F^4 is generated similarly. An
expression for F(disjoint) is F(disjoint) = ab + (1 - ab)cd +
(1 - b)(1 - c)ade + (1 - a)(1 - d)bce.


TABLE I
OP_1-OP_3 USED WITH BOOLEAN ALGEBRAIC TECHNIQUES

Operator   Function                                 Reference
OP_i       Cubes Disjoint Procedure
OP_i       Modified $ operator                      [18]
OP_i       $ operator                               [3]
OP_i       E-operator                               [12]
OP_i       COMPARE( ) function
OP_i       X-operator                               [16]
OP_i       Boolean negation
OP_i       Relative complement and Procedure 1      [10]
OP_i       CMB ( * ) operator                       Proposed algorithm

P II: We have used E-operator [12] to explain the operation
OP_2 and, hence, the concept behind proposition P II. The terms
F^1 = ab, and F^2 = ((1 - a) + a(1 - b))cd are computed using
[12]. To generate F^3, an intermediate term T_3 is obtained as
T_3 = ab + cd |_(each literal of F_3 → 1) = b + c.

Note, E(T_3) is (1 - b)(1 - c). Hence, F^3 = F_3 E(T_3).
Similarly, we obtain F^4.

Equation (3) is

F(disjoint) = ab + ((1 - a) + a(1 - b))cd + (1 - b)(1 - c)ade +
(1 - a)(1 - d)bce.

P III: Use [3] to obtain the terms F^1 = ab, and F^2 =
(1 - ab)cd. Here, the $ (sharp) operator [3] substitutes for OP_3.
The cube F^3 is generated using ((F_3 OP_3 F_1) OP_3 F_2).
The inner term F_3 OP_3 F_1 gives (1 - b)ade, which with
F_2 generates (1 - b)(1 - c)ade. Similarly, compute F^4 for F_4.
Equation (3), then, gives

F(disjoint) = ab + (1 - ab)cd + (1 - b)(1 - c)ade + (1 - a)(1 -
d)bce.

Note, the F(disjoint) expressions obtained from different
propositions, when expanded out, should be identical. In Fig. 1, the
terminal reliability is 0.97848 when each link is assumed to have
a reliability of 0.9.
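The value 0.97848 can be checked numerically. The sketch below is an illustration in Python, not part of CAREL: it evaluates the F(disjoint) expression obtained above with every link reliability set to p, and compares it against a brute-force enumeration of all 2^5 link states of the bridge network. The function names and the use of a single common p are assumptions made for the example.

```python
from itertools import product

PATHS = ["ab", "cd", "ade", "bce"]   # minpaths of the bridge network (Fig. 1)
LINKS = "abcde"

def sdp_reliability(p: float) -> float:
    """Evaluate ab + (1-ab)cd + (1-b)(1-c)ade + (1-a)(1-d)bce with all links at p."""
    a = b = c = d = e = p
    return (a * b
            + (1 - a * b) * c * d
            + (1 - b) * (1 - c) * a * d * e
            + (1 - a) * (1 - d) * b * c * e)

def enumerated_reliability(p: float) -> float:
    """State enumeration: sum the probability of every state with an UP minpath."""
    total = 0.0
    for state in product([0, 1], repeat=len(LINKS)):
        up = {link for link, s in zip(LINKS, state) if s}
        if any(set(path) <= up for path in PATHS):
            k = sum(state)
            total += p ** k * (1 - p) ** (len(LINKS) - k)
    return total
```

Both functions agree for p = 0.9, which supports the remark that the disjoint terms can be read directly as a probability expression. The enumeration grows as 2^l and is only practical here because the bridge has five links.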

C. Existing Boolean Techniques — A Comparison

Propositions P I through P III maintain the minpaths or mincuts
list of (1) in memory. Consider 1 for UP link and 0 for don't care,
and utilize the bit representation technique (discussed in Section II-
A). The memory requirement is, then, ⌈l/16⌉ words per path
(cut), where l is the number of links in the DCS network G(V,
E). Proposition P I makes F_i disjoint with respect to the union of
F^j, j = 1, ..., i-1, while propositions P II and P III utilize the
union of F_j, j = 1, ..., i-1. Should we have
similar operations to implement (2) and (4), the proposition P I
will require more operations than that needed for P II. Generally,
an F_i generates more than one e.m.d. event F^i. Hence, the
number of events involved in the union of the F^j is larger than
that in the union of the F_j.
For example, Table IV shows results for i = 780. The
number of terms in (2) is more than 50 000; on the other hand,
(4) needs exactly (i - 1), i.e., 779 terms. Note, in proposition P
I, the generated e.m.d. events have to be kept in the memory
to implement (2), which is not the case for P II or P III. This
makes proposition P I sequential. Moreover, P I demands a large
memory space to evaluate large DCS networks. On the other
hand, P II or P III has trivial parallelism, making it easier for
the programmers to implement them on parallel systems. Overall,
the proposition P II or P III provides advantages in comparison
with P I.

An analysis of performance comparison between a typical
example of proposition P II and P III is discussed in [1]. SYREL
[1], an implementation technique for E-operator [12], is shown
to have better performance in comparison with the $ operator [3]. It
means proposition P II outperforms P III. Moreover, proposition
P II offers a faster implementation approach than that in P I or
P III. The bit vector implementation makes the realization
of (4) w (word size) times faster than generating (2) based on
proposition P I. Later in Section V, we shall show that CAREL
outperforms E-operator [12] or SYREL [1].


describes the events that a path identified by F_i is operational,
while paths F_j, j = 1, 2, ..., i - 1 fail. The RED operator, on
the other hand, is used to remove the redundant conditional
cubes from the generated set E_j's. We call the nonredundant
cubes minimal conditional cubes (MCC). If we have only one
cube in the MCC set, or the cubes are mutually disjoint among
themselves, we may generate the disjoint events F^i directly. But,
in general, the MCC's are not disjoint. Thus, we need the CMB
operator to create disjoint events of MCC's. Refer to Section II-B
for the need of making MCC's E_j's mutually disjoint. Finally,
the last operator, GEN, gives the disjoint events F^i.

1) COM ( \ ) operator: The COM operator is used to compare
two cubes. The COM1 (COM2) version describes proposition P
I (P II). For P I, consider two cubes F_k = (a_1 a_2 ... a_l) and
F^j = (b_1 b_2 ... b_l), where a_i ∈ {0, 1} and b_i ∈ {-β^I, 0, 1}.

The COM1 ( \ ) is, then, defined as

III. CAREL (COMPUTER AIDED
RELIABILITY EVALUATOR): BACKGROUND

Section III-A presents a notational concept which is useful
to describe CAREL. We have defined the COMpare, REDuce,
CoMBine, and GENerate operators in Section III-B. These four
operators form a basis for our algorithm CAREL.

A.  Notation 


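$$F_k \setminus F^{j} = \begin{cases} \phi, & \text{if } a_i = 1 \text{ in all those } \beta \text{ positions where the } b_i\text{'s are } -\beta^{I} \text{ (this shows that } F_k \text{ and } F^{j} \text{ are mutually disjoint)} \\ E_k, & \text{otherwise} \end{cases}$$

where the conditional set E_k = (c_1 c_2 ... c_l). A c_i (= a_i \
b_i) is obtained using the following table: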


For the DCS network G(V, E), consider a set of paths
P_i's between source s and destination t. The path identifier
F_i identifies P_i in the cubical notation [20] using a string of
symbols {0, 1}. Thus, an UP state of a link in the P_i has 1 in
F_i while a don't care state is represented by a 0. F^i denotes
exclusive and mutually disjoint event(s) and is generated for F_i.
To help obtain F^i, conditional cubes E_j's (defined later) are
utilized. Both E_j and F^i are in the Boolean domain. An E_j is
composed of complemented and absent variables and requires
{-β^I, 0} symbols while F^i uses {-β^I, 0, 1} [19]. The β
represents a positive integer and I is a superscript or index. We
use a superscripted negative integer to represent a complemented
variable (DOWN state of a link). To help illustrate the concept
of this new notation, the cube $\overline{ab}\,c\,\bar{d}$ is represented as (-2^1
-2^1 1 -1^2). Similarly, the product term $\overline{abd}\,\overline{ce}\,g$ is denoted as
(-3^1 -3^1 -2^2 -3^1 -2^2 0 1). Note the following:

1) An uncomplemented, or complemented, or absent variable
is replaced by 1, -β^I, or 0 in the position of the variable,
respectively.

2) A "-β^I" represents complements of β number of variables
which are grouped together. (-1 signifies single
variable complement. Use of index is optional with -1.)

The need of superscript or index is illustrated with the example
of a Boolean term made of three complemented pairs of variables,
represented as (-2^1 -2^3 -2^1 -2^2 -2^2 -2^3). If the indexes
are not used, a notation of (-2 -2 -2 -2 -2 -2) for this example,
in all likelihood, would be wrongly interpreted. The advantage of
the (-β^I, 0, 1) notation lies in its uniqueness in handling
complemented variables that are grouped together [3], [18], [19].
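One way to see what the (-β^I, 0, 1) notation buys is to evaluate the probability of a cube written in it. The Python sketch below is an illustration under stated assumptions, not the authors' data structure: a cube is a list whose entries are 1 (UP), 0 (absent), or a tuple `(beta, index)` standing for -β^I, and every link has the same reliability p. Each group of β complemented variables contributes the factor 1 - p^β, because the group is DOWN unless all of its links are UP.

```python
from collections import defaultdict

def cube_probability(cube, p: float) -> float:
    """Probability of a cube whose entries are 1, 0, or (beta, index) for -beta^index."""
    prob = 1.0
    group_sizes = defaultdict(int)
    for entry in cube:
        if entry == 1:
            prob *= p                      # uncomplemented variable: link is UP
        elif entry != 0:
            group_sizes[entry] += 1        # member of a complemented group
    for (beta, _index), count in group_sizes.items():
        # A well-formed cube lists exactly beta members for each -beta^index group.
        assert count == beta
        prob *= 1 - p ** beta              # group is DOWN unless all members are UP
    return prob
```

For the term (1 - ab)cd over links a through e, the cube is `[(2, 1), (2, 1), 1, 1, 0]`, and with p = 0.9 the function gives 0.19 × 0.81 = 0.1539. The index is what lets the assertion tell two separate groups of size 2 apart from one malformed group of size 4.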


              b_i
  a_i     0     1    -β^I
   0      0    -α     0
   1      0     0     0

Here, α represents the total number of places where a (0, 1) pair
in (F_k, F^j) occurs. For example, consider F_k as (0 1 0 1 0 0 1),
and F^j as (1 -2^1 -2^1 -2^2 1 -2^2 0). The COM1 operation
obtains E_k as given below:

F_k     0     1     0     1     0     0     1
F^j     1    -2^1  -2^1  -2^2   1    -2^2   0
E_k    -2^1   0     0     0    -2^1   0     0

Note, α = 2 as two (0, 1) pairs are present. On the other hand,
the ( \ ) operation with F^j (1 -2^1 -2^2 -2^1 1 -2^2 0) is shown
to be null (φ).

F_k     0     1     0     1     0     0     1
F^j     1    -2^1  -2^2  -2^1   1    -2^2   0
E_k    null set (φ)
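The two COM1 examples above can be reproduced with a short Python sketch. It is a reading of the definition as reconstructed here, not the authors' code: the function name `com1`, the tuple `(beta, index)` for -β^I, and the choice of index 1 for the generated -α entries are assumptions made for illustration.

```python
def com1(f_k, f_j):
    """Sketch of F_k \\ F^j. f_k holds 0/1; f_j holds 0, 1, or (beta, index) for -beta^index.

    Returns None for the null set, otherwise the conditional cube E_k.
    """
    # Collect the positions of each complemented group of F^j.
    groups = {}
    for pos, b in enumerate(f_j):
        if isinstance(b, tuple):
            groups.setdefault(b, []).append(pos)
    # If F_k has a 1 at every position of some group, the cubes are disjoint.
    for positions in groups.values():
        if all(f_k[pos] == 1 for pos in positions):
            return None
    # alpha counts the (0, 1) pairs; those positions receive -alpha (index 1 assumed).
    alpha = sum(1 for a, b in zip(f_k, f_j) if a == 0 and b == 1)
    return [(alpha, 1) if (a == 0 and b == 1) else 0 for a, b in zip(f_k, f_j)]
```

Calling `com1` on the first example returns a cube with (2, 1) in the first and fifth positions, matching E_k above; the second example returns `None` because F_k has a 1 in both positions of the -2^1 group.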


B.  CAREL  Operators 

In what follows, we discuss four operators COM, RED, CMB,
and GEN used in CAREL. Given a set of path identifiers F_i,
we want to generate disjoint events F^i generated from the F_i for
all i. The COM operator generates a set of conditional cubes
E_j, j = 1, ..., i - 1 for an F_i. The conditional cube set E_j's

For proposition P II, the COMpare ( \ ) operator requires two
cubes F_k = (a_1 a_2 ... a_l) and F_j = (b_1 b_2 ... b_l), where
both a_i, b_i ∈ {0, 1}. Here, COM2 is defined in a straightforward
manner as: E_k' = F_k \ F_j = F_k pl F_j df F_k, where "pl" and
"df" are set operators. When the operands F_k, F_j ∈ {0, 1}, the



operators "pl" and "df" are bitwise OR and EOR, respectively. Let
α be the total number of 1's present in E_k'. Replace 1's in
E_k' by -α^1 to generate the conditional cube E_k. For example,
consider "pl" and "df" operations for F_k and F_j given below:

F_k     0 1 0 1 0 1 0
F_j     0 1 0 0 1 0 1     ; use "pl" (bitwise OR) operation
        0 1 0 1 1 1 1
F_k     0 1 0 1 0 1 0     ; use "df" (bitwise EOR) operation
E_k'    0 0 0 0 1 0 1


(i)  E_j'   0 1 0 1 0
     E_k'   0 1 1 1 0     ; "pl" (OR operation)
            0 1 1 1 0
     E_k'   0 1 1 1 0     ; "df" (EOR operation)
            0 0 0 0 0     ; null means E_k' is redundant

(ii) E_j'   0 1 0 1 0
     E_k'   1 1 1 0 0     ; "pl" (OR operation)
            1 1 1 1 0
     E_k'   1 1 1 0 0     ; "df" (EOR operation)
            0 0 0 1 0     ; nonnull means retain both cubes

The conditional cube $E_k$ is, then, obtained from $E_k'$ by
replacing 1 by $-2^1$ in the positions of 1. Thus, $E_k$ = (0 0 0 0 -2^1 0 -2^1).
Note, the COM1 operator detects and eliminates some of
the redundant terms, while other redundancies are deleted by the
RED operator. The COM2 operator does not check redundancy.
Nonetheless, the COM2 operator is bit implementable, and offers
a great advantage over COM1 in terms of computer
memory and speed.
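
The COM2 definition can be illustrated with a minimal Python sketch. It assumes that a path identifier is held in an ordinary integer and that a conditional cube entry is written as a `(value, tag)` pair, where the tag stands for the superscript; the function names `com2` and `conditional_cube` are hypothetical and are not taken from the CAREL program.

```python
def com2(f_k: int, f_j: int) -> int:
    """COM2 on bit cubes: (f_k OR f_j) EOR f_k keeps the 1's of f_j absent from f_k."""
    return (f_k | f_j) ^ f_k


def conditional_cube(e_prime: int, width: int, tag: int) -> list:
    """Replace each 1 of the bit cube by (-alpha, tag); alpha is the number of 1's."""
    alpha = bin(e_prime).count("1")
    cube = []
    for pos in range(width - 1, -1, -1):      # leftmost entry first
        bit = (e_prime >> pos) & 1
        cube.append((-alpha, tag) if bit else 0)
    return cube


# The two cubes of the COM2 example in the text.
f_k = 0b0101010
f_j = 0b0100101
e_prime = com2(f_k, f_j)                       # bit cube E_k'
e_k = conditional_cube(e_prime, width=7, tag=1)
print(format(e_prime, "07b"))
print(e_k)
```

Because both steps are single machine operations per word, the sketch also shows why COM2 is cheap compared with an element-by-element comparison.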

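2) RED ( / ) Operator: Consider two conditional cubes $E_j = (c_1 c_2 \cdots c_l)$
and $E_k = (d_1 d_2 \cdots d_l)$, where
$c_i, d_i \in \{-\alpha^{\beta}, 0\}$. For entries $-\alpha$ of $E_j$ and $-\delta$ of $E_k$, the RED ( / ) operation
is given by

$$
E_j / E_k =
\begin{cases}
E_j, & \text{if } c_i = -\alpha \text{ in all those } \alpha \text{ positions where } d_i = -\delta \text{ and } \delta > \alpha \\
E_k, & \text{if } d_i = -\delta \text{ in all those } \delta \text{ positions where } c_i = -\alpha \text{ and } \alpha > \delta \\
\text{retain } E_j, E_k, & \text{otherwise.}
\end{cases}
$$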


Note, by REDucing either $E_j$ or $E_k$, we remove redundant
product terms. The remaining nonredundant $E_j$'s are henceforth
said to form a minimal conditional cube (MCC). The following
examples illustrate the RED operation:


(i)   E_j    0    -2^1   0    -2^1   0        (ii)  E_j    0    -2^1   0    -2^1   0

      E_k    0    -3^2  -3^2  -3^2   0              E_k   -3^2  -3^2  -3^2   0     0

      E_j    0    -2^1   0    -2^1   0              Retain E_j and E_k

For (i), $-\alpha = -2^1$, $-\delta = -3^2$, and $\delta > \alpha$. Moreover,
both $-2^1$'s are in positions where $-3^2$'s are present. Thus, $E_k$
is redundant and the result is $E_j$. A similar explanation follows
for (ii). The RED operator, defined above, is suitable for P I.
For notational simplicity, let us call it RED1. With proposition
P II, a simpler version (RED2) is adopted. Note, the COM2
operation generates subcubes $E_k'$ which contain 0's and 1's
only. Using the $E_k'$'s and the concept explained in Section II-A,
the redundancy checking required in the RED2 operator is brought
down to the set theoretic operations "pi" and "df." This observation
helps make the RED2 implementation faster than
that of RED1. Using RED2, examples (i) and (ii) are solved by
applying "pi" and then "df" to the bit cubes $E_j'$ and $E_k'$: a null
result means that $E_k$ is redundant, while a nonnull result means
that both cubes are retained.


3) CMB (*) Operator: The CoMBine (*) operator processes
MCC $E_j$'s as its operands. Before applying CMB, partition the
set of $E_j$'s into independent (IG) and dependent (DG) groups.
The MCC's that belong to IG are already mutually disjoint among
themselves. Thus, generating the disjoint events $F^*$ from IG is
straightforward. It has been observed in [1] that most (s, t) paths
in large DCS networks (which are of the loosely connected type) do
not have common elements among them. The partitioning of the $E_j$'s
into IG and DG can be embedded in the implementation of the RED (
/ ) operation (refer to Section IV-B), which avoids unnecessarily
taxing the implementation of the CMB ( * ) operator. Moreover,
since the processing cost (as will be clear later in this section) for
IG is far less than that for DG, the overall improvement in the
performance of the algorithm is obvious.

Definition: Consider MCC's $E_j$ and $E_k$ whose elements
$c_i, d_i \in \{-\alpha^{\beta}, 0\}$. For $1 \le i \le l$, if there exists at least one
$(c_i, d_i)$ pair with $c_i d_i \ne 0$, then $E_j$ and $E_k$ are said to form a
dependent group (DG). As an example, consider $E_j$ as (-2^1 0 -2^1
0 0 0 0), and $E_k$ as (-3^2 0 0 -3^2 -3^2 0 0). Here, $E_j$ and $E_k$ belong
to DG, since they have a common element in position 1. Use this
definition to select the DG's from the nonredundant MCC $E_j$'s. The
remaining $E_j$ terms form the IG's. An $(E_j, E_k)$ pair in IG has
no common elements. It means the $(c_i, d_i)$
pair will always be of the types (0, 0), $(-\alpha, 0)$, and $(0, -\delta)$.
Considering this, terms like (0 -2^3 0 0 0 -2^3 0) and (0 0 0 0 0 0
-1) belong to IG. Note, these terms are independent of both $E_j$
and $E_k$ considered above. For notational simplicity, we denote
the elements of the independent (dependent) group by $IG_j$ ($DG_j$).
For $|E_j| = \eta$, $|IG_j| = \eta_1$, and $|DG_j| = \eta_2$, $\eta_1 + \eta_2 = \eta$.
Note, an element $E_j$ belongs to either IG or DG. The CMB
operator differentiates our algorithm CAREL from [1], [12], [16],
[19]. We discuss the differences in Section V. The following two
cases define CMB: CASE 1 (CASE 2) is used for the independent
(dependent) group.

CASE 1 (CMB for independent group): For $1 \le j < \eta_1$, the
CMB operates iteratively as

$$IG_{j+1} = IG_j * IG_{j+1} \qquad (6)$$

where * is a "pi" operation such that 0 pi 0 = 0, 0 pi $-\delta$
= $-\delta$, and $-\alpha$ pi 0 = $-\alpha$. Equation (6) states that we will
eventually get one term $IG_\eta$.

Example: Consider four $IG_j$'s, namely, $IG_1$ (0 0 -2^1 -2^1 0 0 0 0),
$IG_2$ (-1 0 0 0 0 0 0 0), $IG_3$ (0 -3^3 0 0 -3^3 0 -3^3 0), and $IG_4$ (0 0 0 0 0
-2^4 0 -2^4). We get $IG_\eta$ as follows:


IG_1    0   0  -2^1  -2^1   0   0   0   0

IG_2   -1   0   0     0     0   0   0   0

IG_2   -1   0  -2^1  -2^1   0   0   0   0
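
As a hedged illustration of CASE 1, the following Python sketch folds the four cubes of the example with the "pi" rule of (6). The `(value, tag)` pair encoding of a superscripted entry and the names `pi` and `cmb_independent` are assumptions made for this sketch only.

```python
from functools import reduce


def pi(x, y):
    """Entry-wise "pi": 0 pi 0 = 0, 0 pi y = y, and x pi 0 = x."""
    if x != 0 and y != 0:
        raise ValueError("cubes of an independent group never overlap")
    return x if x != 0 else y


def cmb_independent(cubes):
    """Fold IG_1, IG_2, ... into a single cube, as in equation (6)."""
    return reduce(lambda acc, cube: [pi(x, y) for x, y in zip(acc, cube)], cubes)


A, B, C = (-2, 1), (-3, 3), (-2, 4)     # superscripted entries as (value, tag)
ig1 = [0, 0, A, A, 0, 0, 0, 0]
ig2 = [-1, 0, 0, 0, 0, 0, 0, 0]
ig3 = [0, B, 0, 0, B, 0, B, 0]
ig4 = [0, 0, 0, 0, 0, C, 0, C]

first_step = cmb_independent([ig1, ig2])            # third row of the table
ig_final = cmb_independent([ig1, ig2, ig3, ig4])    # the single remaining term
print(ig_final)
```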




CASE 2 (CMB for dependent group): In this case, the CMB
operator is quite involved. To define this operator, use Steps 1
and 2 below.

Step 1: $DG_1 * DG_2$ generates two cubes $TG_1$ and $TG_2$.
The cube $TG_1$ has $-\delta$ in those $\delta$ positions where cubes $DG_1$
and $DG_2$ both are negative, while the others are "0"s. $TG_2$ has "1"
in all $\delta$ positions and its remaining entries are generated from
$DG_1$ and $DG_2$ using 0 pi x = x pi 0 = x, $x \in \{-\delta^{\gamma}, 0\}$.
Before applying the "pi" operation, update variable x by adding $\delta$ to
x. Note, the entries in the TG's belong to $\{-\delta^{\gamma}, 0, 1\}$.

Step 2: Consider $TG_j = (f_1 f_2 \cdots f_l)$ and $DG_i =
(d_1 d_2 \cdots d_l)$, where $f_i \in \{-\alpha^{\beta}, 0, 1\}$ and $d_i \in \{-\delta^{\gamma}, 0\}$.
The following substeps obtain the $TG_j$'s:

for (all DG_i) begin               /* i = 3, ..., eta_2 */
    for (all TG_j) begin           /* j = 1, ... */
        call DG (TG_j, DG_i, TG');     /* TG' is the result */
        if (terminate)
            j = j + 1;
    end;
    TG = TG';                      /* we create new TG_j list */
end;

Step 2 needs a procedure DG( ) to obtain the various $TG_j$'s.
An algorithm for DG($TG_j$, $DG_i$, TG') is as follows. Note, to
make the algorithm easier to follow, we provide an example for
each case as a Boolean expression, where X is any Boolean
expression.

while (TRUE) begin    /* forever, stop when terminate = TRUE */
    delta = Number of (1, -δ^γ) pairs;
    alpha = Number of (-α^β, -δ^γ) pairs;

    /* Consider the following cases of (delta, alpha) */

    (delta == δ, alpha == don't care) :    /* case a) */
        begin    /* (abcX)(\overline{abc}) = φ */
            terminate = TRUE; return;
        end;

    (delta > 0, alpha == don't care) :     /* case b) */
        begin    /* (abX)(\overline{abcde}) = (abX)(\overline{cde}) */
            for (k = 1 to l) begin
                if (DG_i[k] == -δ^γ) begin
                    if (TG_j[k] == 1)
                        DG_i[k] = 0;
                    else
                        DG_i[k] = DG_i[k] + delta;
                end;
            end;
        end;

    (delta == 0, alpha == 0) :             /* case c) */
        begin    /* CMB for independent group */
            TG_j' = TG_j pi DG_i;    /* (abX)(\overline{cd}) = ab\overline{cd}X */
            append TG_j' to TG';
            terminate = TRUE; return;
        end;

    (delta == 0, alpha == α) :             /* case d) */
        begin    /* (\overline{ab}X)(\overline{abc}) = \overline{ab}X */
            append TG_j to TG';
            terminate = TRUE; return;
        end;

    (delta == 0, alpha == δ) :             /* case e) */
        begin    /* (\overline{abc}X)(\overline{ab}) = \overline{ab}X */
            TG_j' = DG_i pi (the rest of the elements in TG_j other than -α^β);
            append TG_j' to TG';
            terminate = TRUE; return;
        end;

    OTHERWISE :                            /* case f) */
        /* (\overline{abc}X)(\overline{abd}) = \overline{ab}X + (ab\overline{c}X)(\overline{d}) */
        /* note, the first term above is one of the final results */
        begin
            for (k = 1 to l) begin
                if (DG_i[k] == -δ^γ) begin
                    if (TG_j[k] == -α^β) begin
                        TG_j'[k] = -alpha;
                        TG_j[k] = 1;
                        DG_i[k] = 0;
                    end;
                    else begin
                        TG_j'[k] = TG_j[k];
                        DG_i[k] = DG_i[k] + alpha - delta;
                    end;
                end;
                else if (TG_j[k] == -α^β) begin
                    TG_j'[k] = 0;
                    TG_j[k] = TG_j[k] + alpha;
                end;
                else
                    TG_j'[k] = TG_j[k];
            end;
            append TG_j' to TG';
        end;
end;

Note, cases a)-f) are obtained from Theorems A.1 and A.2 in
the Appendix. A proof of the completeness of the CMB operator is also
given in the Appendix.

Example: Assume $DG_1$ (-3^1 0 -3^1 -3^1 0 0 0 0 0 0), $DG_2$ (-4^2 0
-4^2 0 -4^2 0 0 0 0 -4^2), $DG_3$ (-3^3 0 -3^3 0 0 -3^3 0 0 0 0), and $DG_4$
(-4^4 0 0 -4^4 0 -4^4 -4^4 0 0 0), where $DG_i \in$ DG for $1 \le i \le 4$.
Step 1. We generate two cubes $TG_1$ and $TG_2$ as follows:

DG_1   -3^1   0   -3^1  -3^1   0     0   0   0   0   0

DG_2   -4^2   0   -4^2   0    -4^2   0   0   0   0  -4^2

TG_1   -2^1   0   -2^1   0     0     0   0   0   0   0

TG_2    1     0    1    -1    -2^2   0   0   0   0  -2^2
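
Step 1 can be checked with a short Python sketch. It is only an illustration: the superscripts are dropped (each entry is a plain integer), which is an assumed simplification that is harmless for this pair of cubes, and the name `cmb_step1` is hypothetical.

```python
def cmb_step1(dg1, dg2):
    """Step 1 of CMB for a dependent pair; returns the cubes (TG_1, TG_2)."""
    # Positions where both operands are negative.
    common = {i for i, (x, y) in enumerate(zip(dg1, dg2)) if x < 0 and y < 0}
    delta = len(common)

    # TG_1 holds -delta in the common positions and 0 elsewhere.
    tg1 = [-delta if i in common else 0 for i in range(len(dg1))]

    # TG_2 holds 1 in the common positions; every other negative entry is
    # first updated by adding delta and then merged with "pi".
    tg2 = []
    for i, (x, y) in enumerate(zip(dg1, dg2)):
        if i in common:
            tg2.append(1)
        else:
            v = x if x != 0 else y          # at most one entry is nonzero here
            tg2.append(v + delta if v < 0 else 0)
    return tg1, tg2


dg1 = [-3, 0, -3, -3, 0, 0, 0, 0, 0, 0]
dg2 = [-4, 0, -4, 0, -4, 0, 0, 0, 0, -4]
tg1, tg2 = cmb_step1(dg1, dg2)
print(tg1)
print(tg2)
```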

Step 2. Consider $TG_1 * DG_3$ and $TG_2 * DG_3$. Using
procedure DG( ), $TG_1' = TG_1$ because $TG_1 * DG_3$ is of
case d) (refer to the procedure). The $TG_2 * DG_3$ is computed
as follows:

TG_2    1     0    1    -1   -2^2   0     0   0   0  -2^2

DG_3   -3^3   0   -3^3   0    0    -3^3   0   0   0   0     ; case b) delta = 2

TG_2    1     0    1    -1   -2^2   0     0   0   0  -2^2   ; keep TG_2

DG_3    0     0    0     0    0    -1     0   0   0   0     ; update DG_3

TG_2'   1     0    1    -1   -2^2  -1     0   0   0  -2^2   ; case c)

Thus, the new $TG_j$'s are: $TG_1$ (-2^1 0 -2^1 0 0 0 0 0 0 0) and
$TG_2$ (1 0 1 -1 -2^2 -1 0 0 0 -2^2). They are further CoMBined
with $DG_4$.


TG_1 * DG_4 :

TG_1   -2^1   0   -2^1   0     0     0     0    0   0   0

DG_4   -4^4   0    0    -4^4   0    -4^4  -4^4  0   0   0     ; case f)

TG_1'  -1     0    0     0     0     0     0    0   0   0     ; obtains

TG_1    1     0   -1    -3^4   0    -3^4  -3^4  0   0   0


TG_2 * DG_4 :

TG_2    1     0    1    -1    -2^2  -1     0    0   0  -2^2

DG_4   -4^4   0    0    -4^4   0    -4^4  -4^4  0   0   0     ; case b)

TG_2    1     0    1    -1    -2^2  -1     0    0   0  -2^2   ; keep TG_2

DG_4    0     0    0    -3^4   0    -3^4  -3^4  0   0   0     ; case d)

TG_2'   1     0    1    -1    -2^2  -1     0    0   0  -2^2   ; the result

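We get three $TG_j$'s: $TG_1$ (-1 0 0 0 0 0 0 0 0 0), $TG_2$ (1 0 -1 -3^4 0
-3^4 -3^4 0 0 0), and $TG_3$ (1 0 1 -1 -2^2 -1 0 0 0 -2^2). To help understand
these steps, we provide a step by step computation using
Boolean notation. The four $DG_i$'s are equivalent to $\overline{acd}$, $\overline{acej}$,
$\overline{acf}$, and $\overline{adfg}$. We have assumed that a $DG_i$ is a function of the
Boolean variables "a" through "j". The CMB operator determines

$$
\begin{aligned}
(\overline{acd})(\overline{acej})(\overline{acf})(\overline{adfg})
&= (\overline{ac} + ac\,\bar{d}\,\overline{ej})(\overline{acf})(\overline{adfg}) \\
&= (\overline{ac} + ac\,\bar{d}\,\overline{ej}\,\overline{acf})(\overline{adfg})
 = (\overline{ac} + (ac\,\bar{d}\,\overline{ej})(\bar{f}))(\overline{adfg}) \\
&= (\overline{ac} + ac\,\bar{d}\,\overline{ej}\,\bar{f})(\overline{adfg})
 = \overline{ac}\,\overline{adfg} + ac\,\bar{d}\,\overline{ej}\,\bar{f}\,\overline{adfg} \\
&= \bar{a} + a\,\bar{c}\,\overline{dfg} + (ac\,\bar{d}\,\overline{ej}\,\bar{f})(\overline{dfg})
 = \bar{a} + a\,\bar{c}\,\overline{dfg} + ac\,\bar{d}\,\overline{ej}\,\bar{f}.
\end{aligned}
$$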

The above example is solved using the following Boolean identities
[20]:

(x  +  jf)(i  +  a)  s=  r  4-  5 z  ;  ?  -  =  i; 

55  =  (J  15)- 

4) GEN ( ⊕ ) Operator: Consider the path identifier $F_i$
$(a_1 a_2 \cdots a_l)$, and the cubes generated from the CASE 1 and CASE
2 CMB operations as $IG_\eta$ $(e_1 e_2 \cdots e_l)$ and $TG_k$ $(f_1 f_2 \cdots f_l)$,
where $a_i \in \{0, 1\}$, $e_i \in \{-\delta^{\gamma}, 0\}$, and $f_i \in \{-\delta^{\gamma}, 0,
1\}$. The GEN ( ⊕ ) operator, then, obtains the $F_i^*$'s: $F_i^* = F_i \oplus
IG_\eta \oplus TG_k$. As an example, assume cubes $F_i$, $IG_\eta$, and $TG_k$.
Then $F_i^*$ is computed as


F_i      0     0    1   1   0    0     0    0     0   0

IG_η    -2^1  -2^1  0   0   0    0    -1    0     0   0

TG_k     0     0    0   0   1   -2^2   0   -2^2   1   0

F_i*    -2^1  -2^1  1   1   1   -2^2  -1   -2^2   1   0

For a $b_i \in F_i^*$, obtain $b_i = a_i \oplus e_i \oplus f_i$. The bitwise
operation ⊕ is shown below:

0 ⊕ 0 ⊕ 0 = 0
1 ⊕ 0 ⊕ 0 = 1
0 ⊕ -δ^γ ⊕ 0 = -δ^γ
0 ⊕ 0 ⊕ 1 = 1
0 ⊕ 0 ⊕ -δ^γ = -δ^γ.

Here, we have considered only those five (out of 12) combinations
which are feasible.
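
The five feasible combinations amount to taking the single nonzero entry of each column. The Python sketch below illustrates this on the example above; superscripts are omitted and the function name `gen` is an assumption of the sketch, not the routine of the CAREL program.

```python
def gen(f_i, ig, tg):
    """Combine a path identifier with its IG and TG cubes, column by column."""
    result = []
    for a, e, f in zip(f_i, ig, tg):
        if a not in (0, 1) or e > 0:
            raise ValueError("entry outside the allowed alphabet")
        nonzero = [v for v in (a, e, f) if v != 0]
        if len(nonzero) > 1:
            # More than one nonzero entry in a column is not feasible.
            raise ValueError("infeasible combination")
        result.append(nonzero[0] if nonzero else 0)
    return result


f_i = [0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
ig = [-2, -2, 0, 0, 0, 0, -1, 0, 0, 0]
tg = [0, 0, 0, 0, 1, -2, 0, -2, 1, 0]
f_star = gen(f_i, ig, tg)
print(f_star)
```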


Theorem 1: The combinations (0, 1, 0) and ($-\delta^{\gamma}$, ·, ·) are
not possible.

Proof: Using the definition of $IG_\eta$, it is easy to show that the
occurrence of (0, 1, 0) is not possible. Moreover, $F_i$ represents
a path identifier defined over {0, 1}. Thus, a "$-\delta^{\gamma}$" entry
does not appear in $F_i$.

Theorem 2: More than one occurrence of 1 or $-\delta^{\gamma}$ in position
i for the cubes ($F_i$, $IG_\eta$, $TG_k$) is not feasible.

Proof: A multiple occurrence of 1 in position i confirms the
presence of an uncomplemented variable with the ($F_i$, $TG_k$)
combination. The cubes $IG_\eta$ and $TG_k$ represent disjoint expressions,
and are generated for a path identifier $F_i$. Thus, the possibility of
having two 1's in position i does not arise. A similar argument
follows for multiple $-\delta^{\gamma}$ or (1, $-\delta^{\gamma}$) combinations. ∎

IV. CAREL: Algorithm and Implementation

A. Algorithm

The steps of the proposed algorithm are shown below.

CAREL:

begin
    Sort F_i in ascending cardinality, and for terms of the same size,
    use lexicographic ordering;    /* refer to [12], [15] for its advantages */

    F_1* = F_1;

    for all paths F_i begin         /* i = 2, ... */
        COM (F_i, F_j);             /* j = 1, ..., i-1; we get E_j's */
        RED (E_j, E_k);             /* return irredundant IG's and DG's */
        CMB (IG_j, DG_j);           /* produce IG_η and TG_k */
        GEN (F_i, IG_η, TG_k);      /* get F_i*'s */
    end;

    Compute the reliability (unreliability) value R(G) (Q(G));
end.
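
The first step of the listing, the ordering of the path identifiers, can be sketched in Python as follows. The sketch assumes that bit 0 of an identifier stands for link a, bit 1 for link b, and so on, and it reads "lexicographic ordering" as ordering by the sorted list of link positions; both points, and the names `link_positions` and `sort_paths`, are assumptions. The four identifiers encode the hypothetical paths ab, cd, ade, and bce.

```python
def link_positions(path: int) -> tuple:
    """Positions of the 1 bits of a path identifier, lowest position first."""
    return tuple(i for i in range(path.bit_length()) if (path >> i) & 1)


def sort_paths(paths):
    """Ascending cardinality first, then lexicographic order of the link positions."""
    return sorted(paths, key=lambda p: (len(link_positions(p)), link_positions(p)))


paths = [0b11001, 0b10110, 0b01100, 0b00011]    # ade, bce, cd, ab
ordered = sort_paths(paths)
print([format(p, "05b") for p in ordered])
```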

Consider the COM1 and RED1 (COM2 and RED2) operations
while using CAREL with the P I (P II) proposition. Thus, CAREL
applies equally to P I and P II. To compute the terminal
reliability (unreliability) parameter, we have developed a program
called relnum which accepts the output from GEN ($F_i$, $IG_\eta$,
$TG_k$). For a given value of link reliability, relnum produces
a numerical value for the terminal reliability. One may use
any other software package like vaxima [18] to evaluate the
reliability or unreliability expression. Tables III-V show the
reliability figures for 19 different types of DCS networks.

B. Implementation

This section describes an implementation of CAREL for
proposition P II. It is based on bit operations. The implementation
of CAREL P I follows similarly, and will not be discussed. Both
CAREL P I and P II are written in C. In what follows, we
consider four macros which define bit operations. The cost for
each macro call is also given.

1) SetUnion (s1, s2, s);  /* s = s1 ∪ s2 */  cost: ⌈l/16⌉
assignment operations.

2) SetDif (s1, s2, s);  /* s = s1 - s2 */  cost: ⌈l/16⌉ EOR
operations.

3) SetCompare (s1, s2);  /* return TRUE if s1 = s2 */  cost:
≤ ⌈l/16⌉ if statements.

4) SubSet (s1, s2);  /* return TRUE if s1 ⊆ s2 */
cost: 1 SetUnion( ) + 1 SetCompare( ) calls.

In 1)-4), l represents the number of links of the network and
a word "w" is of 16 bits. These four bit operations are used to
implement the CAREL operators.


1) COM ( \ ) Implementation: A procedure COM ($F_i$, $F_j$) is
implemented as follows:

for (j = 1 to i - 1) begin
    SetUnion (F_i, F_j, s);     /* s is temporary result */
    SetDif (s, F_i, E_j');      /* E_j' is the result; append it to E_j' list */
end;

Note, the number of COM( ) function calls is 1 + 2 + ... +
(n - 1) = n(n - 1)/2 times in a network with n paths (cuts).

2) RED ( / ) Implementation: In the COM( ) operation, we
produce $E_j'$ (refer to Section III-B1). The "pi" and "df" bit operations
obtain nonredundant MCC's. The implementation of the RED
operator is shown below:

for all E_j' 's begin                  /* j = 1, ... */
    if (SubSet(E_j', E_k'))            /* E_k' is REDucible by E_j' */
        dispose E_k'; break;
    else if (SubSet(E_k', E_j'))       /* E_j' is REDucible by E_k' */
        dispose E_j';
    else
        create IG and DG groups;
end;

if E_k' was not disposed
    add E_k' to E_j' list;

Note, the nonredundant $E_j'$'s are in bit form (having 0's and
1's only). The MCC $E_j$'s are generated from the $E_j'$'s by replacing
1 by $-\alpha^{\beta}$ in the positions of 1 (refer to Section
III-B). The number of RED( ) function calls is n times in a
network with n paths (cuts). The number of loopings inside the
RED( ) function depends on the network type, and also on the
path identifier $F_i$ for which it is called. For the worst case, RED(
) for $F_i$ needs i loopings, and hence the computation of the RED
operator is of the order $O(n^2)$.
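
A minimal Python sketch of this redundancy check follows. It is not the CAREL code: Python integers stand in for the 16-bit word arrays, the grouping into IG and DG is left out, and the names `subset` and `red2` are assumptions. `subset` mirrors the SubSet macro (one union followed by one comparison).

```python
def subset(s1: int, s2: int) -> bool:
    """s1 is a subset of s2 exactly when the union of the two equals s2."""
    return (s1 | s2) == s2


def red2(bit_cubes):
    """Return the nonredundant bit cubes, scanning them in the given order."""
    kept = []
    for e_k in bit_cubes:
        if any(subset(e_j, e_k) for e_j in kept):
            continue                                   # e_k is REDucible by a kept cube
        kept = [e_j for e_j in kept if not subset(e_k, e_j)]   # kept cubes REDucible by e_k
        kept.append(e_k)
    return kept


# Bit forms of the cubes of RED examples (i) and (ii) in Section III.
case_i = red2([0b01010, 0b01110])
case_ii = red2([0b01010, 0b11100])
print([format(c, "05b") for c in case_i])
print([format(c, "05b") for c in case_ii])
```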

3) CMB (*) Implementation: The CMB (*) operator is the most
time consuming of the four operators we have
used in our method. The implementation of CMB for the independent
group is straightforward, and is shown first:

for (all IG_i) begin            /* i = 1, ..., η_1 - 1 */
    for (j = 1 to l) begin
        if (IG_i[j] ≠ 0)
            IG_{i+1}[j] = IG_i[j];
    end;
end;

The implementation of CMB for the dependent group is expensive.
Note, a $DG_i$ contains $\{-\delta^{\gamma}, 0, 1\}$; we cannot utilize bit operations
for the CMB operator. To update the contents of the CMB operands,
$DG_i$ and $TG_j$, as well as to generate $TG_j'$, we trace the contents
of the operands element by element. The computation time is
highly data dependent. But in any case (refer to cases (a) through
(f) in the procedure considered in Section III-B3), the order of
computation time is $O(l)$. Note, in our program, Step 1 of the
algorithm falls into case (f). The maximum number of $DG_i$'s
is k for generating the e.m.d. event(s) for $F_k$, and the number of
generated $TG_j$'s is $O(k)$. Hence, the worst case cost for calling
CMB( ) is $O(k^2)$ DG( ) calls.

4) GEN ( ⊕ ) Implementation: This operator is processed by
sequentially tracing the contents of $F_i$ and $IG_\eta$ for generating the
$F_i^*$ events. The procedure GEN($F_i$, $IG_\eta$, $TG_k$) implements the
⊕ operator:

for (all TG_k's) begin          /* k = 1, ... */
    for (j = 1 to l) begin
        if (F_i[j] ≠ 0)
            F_i*[j] = F_i[j];
        else if (IG_η[j] ≠ 0)
            F_i*[j] = IG_η[j];
        else
            F_i*[j] = TG_k[j];
    end;
end;

Note, for a path identifier $F_i$ we get the disjoint event(s) $F_i^*$. The
time complexity involved in GEN( ) for a network depends on
the number of generated e.m.d. events. If the maximum
number of e.m.d. events is m, the worst case complexity is of
the order $O(lm)$.

C. Illustrating Examples

Example: Consider Fig. 1 with (s, t) paths ab, cd, ade, and
bce. Paths are encoded as path identifiers $F_i$ ($1 \le i \le 4$), and
are sorted in ascending cardinality as:

F_1 (0000000000000011)   F_3 (0000000000011001)
F_2 (0000000000001100)   F_4 (0000000000010110).

Following the algorithm given in Section IV-A, $F_1^* = F_1$. To
generate the e.m.d. event(s) for $F_3$, we start with COM($F_3$, $F_1$)
and COM($F_3$, $F_2$), which give $E_1'$ (0000000000000010) and
$E_2'$ (0000000000000100), respectively. RED($E_j'$) retains both
$E_j'$'s, and groups them into IG with an empty DG group. The
CMB operation obtains $IG_\eta$ as (0000000000000 -1 -1 0). Finally,
GEN($F_3$, $IG_\eta$, DG) gives $F_3^*$ as (00000000000 1 1 -1 -1 1), which
can be interpreted in an intermediate form as ade(1 - b)(1 - c).
Note, this resultant expression has a one-to-one correspondence with
the probability expression. Similarly, we obtain $F_2^*$ and $F_4^*$. The
various e.m.d. events are:

F(disjoint) = ab + (1 - ab)cd + ade(1 - b)(1 - c) + bce(1 -
a)(1 - d)
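
Because the terms of F(disjoint) are mutually disjoint, the expression can be evaluated by direct substitution of the link reliabilities. The Python sketch below does this and compares the value with a brute-force enumeration of all link states; the function names and the common link reliability of 0.9 are assumptions chosen for illustration.

```python
from itertools import product


def f_disjoint(p):
    """Evaluate ab + (1 - ab)cd + ade(1 - b)(1 - c) + bce(1 - a)(1 - d)."""
    a, b, c, d, e = p
    return (a * b
            + (1 - a * b) * c * d
            + a * d * e * (1 - b) * (1 - c)
            + b * c * e * (1 - a) * (1 - d))


def brute_force(p, paths=("ab", "cd", "ade", "bce")):
    """Add the probabilities of all link states in which some path is fully up."""
    total = 0.0
    for state in product((0, 1), repeat=5):
        up = {name for name, s in zip("abcde", state) if s}
        if any(set(path) <= up for path in paths):
            prob = 1.0
            for s, p_link in zip(state, p):
                prob *= p_link if s else 1 - p_link
            total += prob
    return total


p = [0.9] * 5
r_sdp = f_disjoint(p)
r_enum = brute_force(p)
print(round(r_sdp, 5), round(r_enum, 5))    # both are about 0.97848
```

The enumeration grows as 2^l with the number of links l, so it is only useful as a cross-check on small networks.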

Example: Consider Fig. 4 and its 13 path identifiers as

F_1  (0000000100100010)   F_2  (0000000100010001)

F_3  (0000000010001001)   F_4  (0000000011100010)

F_5  (0000000010001110)   F_6  (0000000100010110)

F_7  (0000000011010001)   F_8  (0000000100100101)

F_9  (0000000101001001)   F_10 (0000000101001110)

F_11 (0000000011010110)   F_12 (0000000010111010)

F_13 (0000000011100101).

The e.m.d. event(s) for $F_8$ are generated using the steps mentioned
in the algorithm. The details are shown below. For $1 \le j \le 7$,
COM($F_8$, $F_j$) obtains

E_1' (0000000000000010),  E_2' (0000000000010000),

E_3' (0000000010001000),  E_4' (0000000011000010),

E_5' (0000000010001010),  E_6' (0000000000010010),
and E_7' (0000000011010000).

The RED operation, then, removes $E_4'$ through $E_7'$, because
these terms are redundant with respect to either $E_1'$ or $E_2'$.
The RED also classifies the nonredundant terms into IG
and an empty DG set. The CMB operator obtains $IG_\eta$ as
(00000000 -2^3 0 0 -1 -2^3 0 -1 0). Finally, the GEN operator generates $F_8^*$
as (0000000 1 -2^3 0 1 -1 -2^3 1 -1 1).

Example: This example shows the generation of the e.m.d.
event(s) for $F_9$ and, thus, illustrates the concept of DG's. For
$1 \le j \le 8$, COM($F_9$, $F_j$) obtains

E_1' (0000000000100010),  E_2' (0000000000010000),

E_3' (0000000010000000),  E_4' (0000000010100010),

E_5' (0000000010000110),  E_6' (0000000000010110),

E_7' (0000000010010000),  and E_8' (0000000000100100).

$E_4'$ through $E_7'$ are REDuced. The RED procedure lists $E_2'$ and
$E_3'$ with IG, while $E_1'$ and $E_8'$ are kept in the DG set. The CMB
operation obtains $IG_\eta$ as (00000000 -1 0 0 -1 0 0 0 0). The elements of the


Fig.  i  6-node,  i-linit  aetwort 


Fig.  3.  S-oode,  S-linlc  Network 


DG group are used to generate $TG_1$ and $TG_2$ (in Step 1) as
follows:

DG_1   0000000000   -2^1   0   0    0    -2^1   0
DG_2   0000000000   -2^2   0   0   -2^2   0     0
TG_1   0000000000   -1     0   0    0     0     0
TG_2   0000000000    1     0   0   -1    -1     0

Since we do not have any other $DG_i$'s, Step 2 of the CMB operation
is skipped. Terms $TG_1$ and $TG_2$ are the final results for CASE
2 (of CMB). Hence, from the GEN operator, we get two e.m.d.
events for $F_9$, namely (0000000 1 -1 1 -1 -1 1 0 0 1) and
(0000000 1 -1 1 1 -1 1 -1 -1 1). An expression for F(disjoint) of Fig. 4 is given
as

bfi + aei(1 - bf) + adh((1 - i) + i(1 - bf)(1 - e))

+ bfgh(1 - ad)(1 - i) + bcdh(1 - a)((1 - f)

+ f(1 - i)(1 - g)) + bcei(1 - a)(1 - dh)(1 - f)

+ aegh(1 - bf)(1 - d)(1 - i) + acfi(1 - b)(1 - dh)(1 - e)
+ adgi((1 - e)(1 - f)(1 - h)
+ (1 - b)(1 - c)(1 - e)f(1 - h))

+ bcdgi(1 - a)(1 - e)(1 - f)(1 - h)

+ bcegh(1 - a)(1 - d)(1 - f)(1 - i)

+ bdefh(1 - a)(1 - c)(1 - g)(1 - i)

+ acfgh(1 - b)(1 - d)(1 - e)(1 - i).

Assuming that each link has 0.9 as its probability of survival,
the terminal reliability for the DCS (Fig. 4) is 0.977814.

V. Experimental Results and Discussion

Table II gives us the information on the networks used in our
experiments. We name a network where # paths and #

cuts are the number of minpaths and mincuts, respectively. As
shown.  Figs.  1-^  possesses  less  than  20  paths,  while  there  are 
281 and 780 paths with the DCS given in Figs. 17 and 18, and
19, respectively. Tables III-V provide results for 19 different
types of DCS networks. We have considered two performance
parameters:

i) The number of disjoint terms generated (DPath), and
ii) The computer time involved to obtain the reliability figure.
This includes the processing time for i) and also the reliability

Fig. 5. ARPANET in 1971.


Fig. 6. 7-node, 15-link network.


Fig. 7. 11-node, 21-link network.


or unreliability expression evaluation time using relnum or
vaxima [18].

First, the performance (refer to Table III) of CAREL (with the
P I option) is compared with that of VT [18], a representative
method for proposition P III. The reliability values obtained in
both CAREL and VT [18] are exactly the same. CAREL (with the P
I option) obtains the same order of e.m.d. events as in [18]. The
number of e.m.d. events generated is influenced by the ordering
of the minpaths/mincuts [D]. When we scrambled the ordering
of the paths, we got more (fewer) events than the ones
reported in the table, with exactly the same reliability values.
We notice that when a network generates fewer e.m.d.
events, it requires less computation time. From this observation,
we believe that preprocessing the minpaths/mincuts (in addition to
sorting them in ascending cardinality as suggested in [12]) is
needed to further improve the performance of existing methods.
The results in Table III show the superiority of CAREL (with the
P I option) as compared to VT [18], a representative technique
of proposition P III. The running time improvement in CAREL
is more noticeable when we evaluate larger networks (Figs. 16
and 18). The results support the discussion in Section II-C and the
analysis in [1]. Note, COM1 and RED1 (COM2 and RED2) are
used in CAREL with the P I (P II) option. Thus, CAREL P I and
CAREL P II differ only in generating the minimal conditional
set (MCC) $E_j$. Like other methods in proposition P II, CAREL
P II obtains the MCC's from the minpaths/mincuts of a network
(4), while CAREL P I incorporates (2) (Section II-B). The
COM1, in addition to generating the conditional set, reduces some
of the redundant terms too. Thus, the RED1 operation in CAREL
P I utilizes fewer operands than that used in RED2.
However, the COM1 operator utilizes more terms than COM2 (refer
to Section II-C). Bit operations are used to implement CAREL
P II, and this is one additional factor which makes this method run
faster than CAREL P I. The other problem with the CAREL P I
method is the memory size used. To incorporate (2), this
method has to maintain a list of e.m.d. events. For a large
network which generates more than 50 000 events (Table IV),
the program requires a huge memory space that is available only
on a large system. Furthermore, the huge number of events reduces
the speed of CAREL P I significantly. CAREL P II, as well as
CAREL P I, requires a list of minpaths/mincuts. Since one path
/ cut in a network which contains l links needs ⌈l/w⌉ words
for w word size, this requirement does not prevent CAREL P
II from solving large distributed systems. Both variants of CAREL
use  the  same  CMB  and  GEN  operatots  for  thetr  OP,  and  OP 
In CAREL P II, generating the e.m.d. event(s) of a path identifier
$F_i$ is independent from the other e.m.d. events obtained for other
path identifiers $F_j$'s. From this observation, we notice that minor
changes in CAREL P II will make it quite suitable for parallel
system implementation. Overall, CAREL P II is a better method
compared to CAREL P I.

Table IV shows the comparisons of CAREL P I and CAREL P
II in terms of the e.m.d. events generated, the reliability values, and
the running time to obtain these values. Both methods get the
same e.m.d. events and reliability values. However, CAREL P
II obtains the results faster than CAREL P I, as we can see from
the table. Furthermore, as expected from our previous discussion,
CAREL P I has problems evaluating large networks (e.g., Fig.
19). The results show that proposition P II is better than P I,
which backs up our previous discussion in Section II-C. We, thus,
conclude that CAREL P II (an implementation of proposition P II)
is a better method compared to both CAREL P I (proposition P
I) and VT [18] (proposition P III).


Fig. 10. 8-node, 12-link network (Fig. 9) with different source.


Fig. 11. l-vaOt, 12-link network.


Fig. 12. 8-node, 13-link network [18].


Fig. 13. 16-node, 30-link network [14].

CAREL P II is more efficient compared with SYREL [1].
SYREL needs 0.8 s (on a VAX 11/750) to get the terminal
reliability value for the network shown in Fig. 9, while CAREL P
II uses only 0.1 s (on an Encore Multimax). CAREL P II, while
keeping the algorithm efficient, produces fewer disjoint
event expressions compared to SYREL. Our CAREL cube
notation enables the algorithm to produce concise expressions,
hence fewer events, which reduces the running time of the
aigonthm.  For  example,  SYREL  produces  disjoint  events  ^  as 
adh  (i  -r  bet  1-  /I  bi),  while  CAREL  generates  aoh  (:-?{>/  i). 
which  contain  3  and  2  terms,  respecovely.  We  do  not  provide 
other comparison results of CAREL and SYREL [1] because
SYREL does not report any results for larger sized networks.
Since CAREL P II basically combines the best feature of SYREL
[1] (i.e., using MCC) and takes into account the advantage of using the
variable groupings of [3], we expect CAREL P II to outperform
both SYREL [1] and [3], as verified by our experimental results.

The comparisons of evaluating reliability and unreliability
values of distributed system networks are shown in Table V. We
use the same networks as in Table II, and utilize CAREL with the P
II option. The unreliability values are obtained from the e.m.d.
events of the mincuts of the networks. We generate mincuts from
minpaths using a simple program based on a method discussed
in [2]. However, one may use other methods for this step. In
most DCS networks (shown here), we obtain more
cuts than paths. Thus, evaluating the reliability values
is faster than computing the unreliability values. For networks
which have fewer cuts than paths, the opposite is true.
The table shows that the sum of the reliability and unreliability
figures of a network is always 1 (as expected); however, rounding
off 10 000 or more terms produces a small quantization
error for large networks. The values obtained further support
the correctness of our proposed algorithm. Hence, the reliability
(unreliability) value of a network can be obtained from the
computed unreliability (reliability). The results show that CAREL
is capable of evaluating both reliability and unreliability values
of large distributed system networks.

VI.  Conclusion 

We have proposed an efficient algorithm called CAREL which
computes the terminal reliability or unreliability of moderate
to large sized DCS networks with modest memory and time
requirements. The algorithm has been implemented in C and
was run on an Encore MULTIMAX 320 system. The performance
of a program is usually based on the algorithm, the
data structure and the language used, the computer on which
the program is run, and last but not least, the coding of
the program [8]. Since different programmers produce different
codings for an algorithm, the human factor (in want of sufficient
data) is inappropriate while comparing various techniques. A
better implementation or faster machine would increase the
performance of a program, but only up to a factor of 10 [11].
Moreover, all methods of reliability computation are known to be
computationally intractable or NP-hard, which makes it difficult to
compare the techniques from the aspect of complexity [2], [9],
[10], [26], [27]. CAREL is faster than other existing Boolean
algorithms. This is obvious from the CPU time requirement
for solving the terminal reliability/unreliability of various DCS
networks. Note, CAREL combines the advantages offered
both by SYREL [1] and the method given in [3]. Presently, we
are utilizing CAREL to help compute the reliability issues of
multiprocessor systems [6], [28], [29], [30]. An earlier version
of CAREL has successfully been applied to solve reliability
problems in one type of redundant path MIN [19].

V. Appendix

Proving Correctness of CAREL

CAREL uses four operators, namely COM, RED, CMB, and GEN, to transform an expression of paths (or cuts) into an equivalent exclusive and mutually disjoint (e.m.d.) expression. The COM operator defines conditional cubes E_j's for an F_i, while RED removes redundant E_j's to provide the minimal conditional cubes (MCC). The operation CMB combines the MCC to generate disjointing terms (DT's). The DT's are mutually disjoint. Moreover, the F_i and its DT's (refer to the GEN operator) form an expression which is disjoint with all other terms in (1). In what follows, we show that CAREL always generates e.m.d. terms (DT's) for an F_i.

Section III-B1 describes the COM operator. A conditional cube E_j considers the elements of F_j which are not present in F_i. To represent E_j DOWN (refer to Section III-A), our notation uses "-a" in the positions of the variable (here the index l is equal to j). Considering CAREL (Section IV-A), it is obvious that COM obtains all possible E_j's for an F_i.

TABLE IV
COMPARISON OF CAREL USING P I AND P II FOR CPU TIME(1) AND DISJOINT PATHS

Network   P I                 P II                Reliability(2)
          DPath   Time (s)    DPath   Time (s)
?         4       0.0         4       0.0         0.978480
?         ?       0.0         7       0.0         0.968425
?         11      ?           ?       0.0         0.997632
?         16      0.0         16      0.0         0.977184
?         15      0.1         15      0.0         0.964855
?         ?       0.1         ?       0.1         0.996665
?         94      0.2         94      ?           0.994076
?         30      0.1         30      0.1         0.969112
?         39      0.1         39      ?           0.975116
?         34      0.1         34      ?           0.984068
?         50      0.1         50      0.1         0.997494
?         76      0.1         76      0.1         0.996217
?         542     ?           542     ?           0.997186
?         105     ?           105     0.1         ?
?         37      0.2         87      0.1         0.974145
?         309     ?           309     0.2         0.997506
?         2491    4.2         2491    ?           0.985928
?         2386    3.6         2386    ?           0.987390
?         ****    ?           54032   3.4         ?

(1) Run on an Encore Multimax system.
(2) Link reliability = 0.9.
**** Results not known.
Time is in CPU seconds.

Lemma A.1: Assume two conditional cubes E_j and E_k of F_i. If E_j ⊂ E_k, then E_k is redundant. □

The RED operation implements Lemma A.1 and is performed for all (E_j, E_k) pairs. Note, the definition in Section III-B2 checks for idempotence- and absorption-type redundancies. The nonredundant E_j's form the MCC. Besides removing redundancies, the RED operator partitions the MCC into IG's and DG's (refer to Section III-B3). Thus, RED speeds up the computation time of CAREL. The CMB operator for an independent group (IG) uses $\bar{x} \cdot \bar{y} = \bar{x}\,\bar{y}$ iteratively. As is obvious from Theorem A.1 (discussed later), $\bar{x} \cdot \bar{y}$ is a special case of $X_i^{c_i} \cdot X_j^{c_j}$. Here, we assume that x and y have no common elements (x and y are independent). The CMB operator for a dependent group (DG) utilizes the lemmas and theorems mentioned below.

Definition: An X_i represents a cube which could be an MCC E_j, or the one produced during the CMB operation. For notational convenience, consider $X_i^{c} = X_i\ (\bar{X}_i)$, when c = 1 (0).

TABLE V
COMPARISON OF EVALUATING RELIABILITY OF A NETWORK FROM PATHS AND CUTS USING CAREL (P II)

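Lemma A.2: Assume T_1, T_2, ..., T_k represent k partitions of X_i. Then,

$X_i^{c} = (T_1 T_2 \cdots T_k)^{c} = \begin{cases} T_1 T_2 \cdots T_k, & c = 1 \\ (\bar{T}_1,\ T_1\bar{T}_2,\ \ldots,\ T_1 T_2 \cdots T_{k-1}\bar{T}_k), & c = 0 \end{cases}$

Proof: For c = 1, the result is obvious. With c = 0, DeMorgan's complementation law is rewritten so that the list of T_j's is collectively exhaustive. □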

Theorem A.1: Consider I, L (J, K) as 2-partitions of X_i (X_j). We have $X_i^{c_i} \cdot X_j^{c_j} = (IL)^{c_i}(JK)^{c_j}$. The term $(IL)^{c_i}(JK)^{c_j}$ is

1) $X_i^{c_i} X_j^{c_j}$, when X_i ∩ X_j = φ, i.e., X_i and X_j are independent;

2) A, when X_i ∩ X_j = L, i.e., L = J. Thus, L = J represents a common term between X_i and X_j.

For various combinations of c_i and c_j, "A" is obtained as

i) Case 1: [c_i = c_j = 1]. $A = ILK$.

ii) Case 2: [c_i = 1 (0), c_j = 0 (1)]. $A = IL\bar{K}\ (\bar{I}LK)$.

iii) Case 3: [c_i = c_j = 0]. $A = (\bar{L},\ L\bar{I}\bar{K})$.

Proof: For X_i ∩ X_j = φ, $X_i^{c_i} \cdot X_j^{c_j} = X_i^{c_i} X_j^{c_j}$ is straightforward. Use Lemma A.2 to show the results in Cases 1-3. With c_i = c_j = 1 and L = J, Case 1 is obvious. For c_i = 1, c_j = 0, $X_i \cdot \bar{X}_j$ is $(IL)(\overline{LK})$ or $(IL)(\bar{L},\ L\bar{K})$. A result $(\bar{L},\ L\bar{I})(LK)$ follows similarly when c_i = 0, c_j = 1. Thus, Case 2 of Theorem A.1 is proved. For c_i = c_j = 0, $\bar{X}_i \cdot \bar{X}_j = (\overline{IL})(\overline{LK})$. Using Lemma A.2, we interpret this result as $(\bar{L},\ L\bar{I})(\bar{L},\ L\bar{K})$ which, after applying a Boolean identity [20], produces $(\bar{L},\ L\bar{I}\bar{K})$. □
Lemma A.3: Note, X · φ = φ · X = φ, where φ is a null set, and X represents any term.

Proof: Using Theorem A.1, the proof is obvious. □
Theorem  A.2:  For  X^X^.X^,  the  CMB  operator  produces 
Xi  Xi  •  X)  =  G Fill.  Mil)  where  G.  F  (/,  E)  represent 
2-partiaons  for  X\  (Xi),  and  F,  H,  J  m  3-partitions  for  Xj. 

Proof:  Note.  X\Xi  is  a  term  obtamed  considering  Xf'  rX‘' 
in  Theorem  A.1  and  represent  munially  independent  terms.  i.e., 
Xi  and  Xi  will  have  no  term  in  common.  Using  Lemma  A.1, 
(XiXl  •  ^)  IS  shown  to  be  equal  to  (G  f  15  FHJ).  Theorem 
A-2  is  proved  after  applying  Lemma  A.1  □ 

From Theorems A.1 and A.2 it is clear that if we CMB k number of X_i's, we may generate a term of the type $X_1^{c_1} X_2^{c_2} \cdots X_k^{c_k}$. An iterative application of these theorems solves $(X_1^{c_1} \cdot X_2^{c_2} \cdot X_3^{c_3} \cdots)$ as $((X_1^{c_1} \cdot X_2^{c_2}) \cdot X_3^{c_3}) \cdots$. Note, a CMB obtains e.m.d. events (we have called them DT's).

Finally, GEN combines the F_i with the disjointing terms DT's. The operator utilizes five out of 12 possible combinations for ("-a", 0, 1) alphabets and is demonstrated to be complete (refer to the text). Hence, the algorithm CAREL is proved to be correct.

Summary & Conclusions - The paper presents a computer approach to obtain a survivability index called capacity related reliability (CRR) in large telecommunication networks where links have different capacities. The proposed method is a 2-step approach. Step 1 deals with composite path enumeration (CPE). A k-composite path is defined as the union of the set of edges in any k simple paths and relates link capacity and network connectivity. The CPE approach, presented in the text, is an improvement over the algorithms of [1]. Step 2 manipulates the k-composite path information to generate the CRR. The paper uses CAREL [4] to solve this step. However, any existing techniques based on Boolean concepts, probability theory, the inclusion-exclusion principle, etc. [2-7] may also be utilized. The technique is automated using C on an Encore Multimax System. The results on CRR for three networks with various values of minimum message capacity are presented in tables. An exhaustive technique is used to verify these results. However, an informal proof of the CPE approach is also included. Appendix A provides the implementation details of the technique. We have given counter examples (refer to Appendix B) for the algorithms of [10, 13] to show that these methods lead to incorrect conclusions under certain situations.


1.  INTRODUCTION 

To obtain the survivability index of a large telecommunication network, the network is modeled as a probabilistic graph G(V,E). Nodes (V) identify communicating centers, and links (E) represent connection services. Various measures for the survivability index are presented in the literature [1-17]. They are characterized by different operational environments (OE) in which the network carries out its desired operation. As an example, in a message switching or virtual circuit packet switching network, we need to establish a node to node connection in G(V,E). As long as this 'connectedness' property of the network is maintained, one may successfully transmit messages (packets) through the network even if some of the nodes and/or links in G(V,E) fail.

This work has been partially supported by the US Air Force Office of Scientific Research under grant AFOSR-91-0025. A preliminary version of this paper was presented at the 1991 Annual Reliability and Maintainability Symposium [1].

A network is generally validated for the connectedness of its OE by enumerating simple paths between all node pairs. In such situations, link capacity is often ignored or implicitly assumed to be equal and also large enough to sustain transmission of messages (packets) of any bandwidth (size). This assumption is unrealistic. The link capacity is a function of cost, and is limited. Each link in G(V,E) may have a different capacity. Moreover, there may be a minimum message capacity (W_min) requirement through the network. Obviously, the measure of connectedness using simple paths is not enough to validate this form of OE. This paper defines the concept of a k-composite path which captures the effect of message bandwidth, link capacity, and network connectivity. Note, a composite path satisfying an OE, where the requisite amount of message bandwidth is also made available through the network having heterogeneous link capacities, is a success state of the network.

Recently, a few researchers have addressed the problem of capacity related reliability (CRR), or combining link capacity with terminal reliability. Doulliez & Jamoulle [8] have applied the decomposition principle to calculate the system reliability. They decompose the whole state space into three categories: a set of functioning states, a set of failed states, and a set of undetermined states. Each set of undetermined states is again decomposed into three categories, and so forth until the set of undetermined states is null. This implies keeping track of numerous sets of undetermined states as well as the relevant upper and lower limiting states of each set. Hence it requires large memory sizes for large systems. Lee's technique [9] uses a labelling scheme to route the flow through the network. It is, however, suited to acyclic graphs only. Its adaptation to mixed and cyclic graphs poses a problem because of the existence of feedback (an unsuitable situation for the labelling scheme). Misra & Prasad [10] utilize a failure path list to enumerate a term like the composite path defined in the text. The success of a 2-composite path is tested against a given W_min. Next, the method [10] proceeds with the failure paths which have not produced any success 2-composite paths, and combines them to generate higher order composite paths. Each iteration checks for success using W_min and removes the simple paths which have already combined to produce success composite paths. The method [10] terminates when no more simple paths are available to generate k-composite paths. A counter example, presented in Appendix B, proves that [10] fails in general to give the correct result. Moreover, [10] does not provide any procedure to compute the capacity of a composite path which, in turn, is useful to decide the success or failure state of the network. Recently, [12] proposes a technique to generate the CRR. Aggarwal [13] mentions that the method in [12] lacks generality and provides incorrect results. We have given a counter example (refer to Appendix B) to show that the algorithm in [13] leads to a wrong conclusion too. The basic problem with [12, 13] lies in the procedure used to compute composite path capacity. Le & Li [20] have addressed the problem of reliability of networks with dependent failures and multimode components. They assign link capacities and obtain a numerical figure, not the reliability expression, for the network. Rai et al. [11] and Rueger [14] have proposed algorithms which obtain a symbolic CRR expression under a capacity constraint. However, their algorithms suffer from the drawback of generating a huge number of redundant paths/cuts. They are, thus, impractical even for moderate sized graphs where the number of paths is more than ten. Recently, Rai & Soh [1] have proposed two algorithms to enumerate composite paths. The composite path enumeration technique, presented in this paper, is an improvement over the algorithms of [1]. Section 5 provides a detailed discussion.

The layout of the paper is as follows: Section 2 describes the background material. Section 3 presents composite path concepts and the issues related to their enumeration. It also introduces several definitions and theorems which are useful in reducing the complexity of the composite path enumeration (CPE) approach. Section 4 describes the CPE technique that is further illustrated by an example. An informal proof of the algorithm and its time complexity analysis are also given in this section. Section 5 discusses the experimental results on the CRR index, obtained using CAREL [4], for three networks with various values of W_min. Finally, Appendix A provides an implementation technique for the CPE algorithm using bit vector representation and Appendix B gives the counter examples to the methods [10, 13].

2.  PRELIMINARIES 

In the graph model G(V,E) of a telecommunication network, consider that an edge j has a finite capacity w_j which is known a priori. Let l be the total number of edges in G(V,E). A flow in a network is a function assigning a non-negative number f_j to each edge j so that f_j ≤ w_j, and for a vertex (that is neither source nor terminal) the in- and out-flow are the same (flow conservation). Note, w_j provides a bound on the flow passing through edge j. The network is good if and only if a specified amount of signal capacity (W_min) is transmitted from the input to the output node or (s,t) node pair.

An edge j is said to be UP (DOWN) if it is functioning (failed). An UP (DOWN) edge is denoted by j ($\bar{j}$). An (s,t) cut is a disconnecting set. All communication between a prescribed (s,t) node pair is disrupted once the edges in an (s,t) cut fail. An (s,t) cut i, C_i, is minimal if no proper subset of it represents a cut. The cut set C_{s,t} is the set of all minimal cuts for the graph G(V,E). Let the total number of cuts be n.


The capacity of a cut, W(C_i), for a minimal cut C_i is the sum of the capacities of the edges in C_i. From the max-flow min-cut theorem [18], the maximum capacity flow, W_max, through the graph G(V,E) is:

$W_{max} = \min_i \{W(C_i)\}$    (1)

As an example, consider the bridge network shown in figure 1. The cut set C_{s,t} is {(1,2), (1,3,5), (2,3,4), (4,5)}. Using [18], the capacities are W(C_1) = 14, W(C_2) = 19, W(C_3) = 12, and W(C_4) = 7. Here, edge capacities are w_1 = 10, w_2 = 4, w_3 = 5, w_4 = 3, and w_5 = 4. Now, using (1), W_max = min{14, 19, 12, 7} = 7.
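
The arithmetic of this example can be reproduced with a short sketch. The capacities and the cut set are the ones quoted for the bridge network of figure 1; the function names are illustrative only.

```python
# Link capacities w_1..w_5 of the bridge network (values quoted in the text).
capacity = {1: 10, 2: 4, 3: 5, 4: 3, 5: 4}
# Minimal (s,t) cuts of the bridge network (quoted in the text).
cut_set = [(1, 2), (1, 3, 5), (2, 3, 4), (4, 5)]


def cut_capacity(cut, capacity):
    """W(C_i): sum of the capacities of the edges in the cut."""
    return sum(capacity[j] for j in cut)


def max_flow_from_cuts(cut_set, capacity):
    """Equation (1): W_max is the smallest cut capacity."""
    return min(cut_capacity(cut, capacity) for cut in cut_set)


cut_caps = [cut_capacity(cut, capacity) for cut in cut_set]
w_max = max_flow_from_cuts(cut_set, capacity)
print(cut_caps, w_max)  # [14, 19, 12, 7] 7
```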


Figure 1. Bridge Network

[The link capacity is shown within ( ).]


A simple path i, P_i, for an (s,t) node pair is formed by the set of UP edges such that no node is traversed more than once. Note, no proper subset of a simple path results in a path between the two nodes of the pair. The path set P_{s,t} is a set whose elements are simple paths. Let m be the total number of simple paths in G(V,E). The capacity of a simple path P_i, W(P_i), is obtained from the capacities of the UP edges contained in P_i and is [11]:

$W(P_i) = \min_{j \in P_i} \{w_j\}$    (2)

The capacity for path P_1 = (1,4) in figure 1 is W(P_1) = min{w_1, w_4} = 3.
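
As a small illustration of (2), the sketch below evaluates the capacity of each simple path of the bridge network. The capacities are those quoted for figure 1 and the four paths are the simple paths of that network; the helper name is hypothetical.

```python
# Link capacities of the bridge network in figure 1 (quoted in the text).
capacity = {1: 10, 2: 4, 3: 5, 4: 3, 5: 4}
# Simple paths of the bridge network, each given as a tuple of edge labels.
path_set = {"P1": (1, 4), "P2": (1, 3, 5), "P3": (2, 5), "P4": (2, 3, 4)}


def path_capacity(path, capacity):
    """Equation (2): a path carries no more than its weakest edge."""
    return min(capacity[j] for j in path)


path_caps = {name: path_capacity(p, capacity) for name, p in path_set.items()}
print(path_caps)  # {'P1': 3, 'P2': 4, 'P3': 4, 'P4': 3}
```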

If W(P_i) ≥ W_min, the P_i, in addition to satisfying the connectivity requirements, fulfills the capacity constraint too. The path P_i is, then, called a success state of G(V,E). Otherwise, the P_i represents a failure state. Note, in the event that edge capacities are infinitely large, all simple paths form success states because they do provide (s,t) connectivity, and their successes ensure the network success. However, for a finite capacity situation, all simple paths may or may not lead to the success states of G(V,E). Depending on W_min, some or all simple paths may fail to satisfy the capacity constraint. Thus, the simple path (minimal cut), an important concept in terminal reliability, has to be revisited while considering the CRR measure. The concept of composite path (introduced in Section 3) is a step in this direction.

3. COMPOSITE PATH
3.1 Concept and Issues

Definition: A k-composite path CP_I(k) is defined as the union of the sets of edges in any k simple paths P_i's, where i ∈ I, and (1 ≤ k ≤ m). □

Note, a k-composite path describes a subgraph of G(V,E). Moreover, all simple paths represent k-composite paths where k = 1.

Example. Consider the bridge network in figure 1. The path set P_{s,t} of the network is: P_1 : (1,4), P_2 : (1,3,5), P_3 : (2,5), and P_4 : (2,3,4). From the P_{s,t} we generate the 2-composite paths:

CP_{1,2}(2) = (1,3,4,5), CP_{1,3}(2) = (1,2,4,5),

CP_{1,4}(2) = (1,2,3,4), CP_{2,3}(2) = (1,2,3,5),

CP_{2,4}(2) = (1,2,3,4,5), and CP_{3,4}(2) = (2,3,4,5).

Lemma 1. For m simple paths representing (s,t) connectedness of the network, the total number of possible k-composite paths is (2^m - 1). □
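
The example and the count of Lemma 1 can be reproduced by brute force. This is a minimal sketch, assuming the four simple paths of the bridge network; `composite_paths` is a hypothetical helper that simply takes the edge union of every k-subset of paths.

```python
from itertools import combinations

# Simple paths of the bridge network, indexed 1..4 as in the example.
path_set = {1: (1, 4), 2: (1, 3, 5), 3: (2, 5), 4: (2, 3, 4)}


def composite_paths(path_set, k):
    """Map each k-subset of path indices to the union of its edge sets."""
    result = {}
    for idx in combinations(sorted(path_set), k):
        edges = set()
        for i in idx:
            edges |= set(path_set[i])
        result[idx] = tuple(sorted(edges))
    return result


cp2 = composite_paths(path_set, 2)
for idx, edges in cp2.items():
    print(idx, edges)

# Lemma 1: summing over k = 1..m gives 2^m - 1 composite paths.
total = sum(len(composite_paths(path_set, k))
            for k in range(1, len(path_set) + 1))
print(total)  # 15
```

The exponential count is the reason the text looks for ways to avoid generating every combination.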

Misra & Prasad's technique [10] reduces the generation of all (2^m - 1) possible k-composite paths. However, a counter example in Appendix B shows that the method in [10] lacks generality. Another problem in composite path enumeration stems from the capacity computation for the CP_i(k). The capacity of a simple path W(P_i) is obtained easily from (2). However, we need to devise an efficient technique to get the capacity of a CP_i(k) for k > 1. Ref [10] does not discuss any method. Papers [12, 13] have proposed techniques to help obtain the capacity of a CP_i(k). However, [13] mentions that the method in [12] is not correct. Appendix B provides a counter example to show that [13] also fails to generate correct results under certain situations. We provide a lemma to evaluate the capacity of a composite path CP_i(k), for k > 1. For this, we introduce the concept of Composite Path Cut (CPC).

Definition: A CPC_i(j) is a modified (s,t) cut C_j for the graph G(V,E) and is defined for a composite path CP_i(k). The CPC_i(j) is:

$CPC_i(j) = CP_i(k) \cap C_j; \quad j = 1, \ldots, n.$    (3)

□

Since a CP_i(k) describes a subgraph of G(V,E), the CPC_i(j) represents a cut for the CP_i(k) induced graph. The failure of the edges in the CPC_i(j) leads to CP_i(k) communication disruption between a prescribed (s,t) node pair. Note, there exist n CPC_i(j)'s for a composite path CP_i(k).

Lemma 2. The weight of a composite path, W(CP_i(k)), is:

$W(CP_i(k)) = \min_j \{W(CPC_i(j))\}$    (4)

where W(CPC_i(j)) represents the weight of a CPC_i(j) and is obtained by applying (1) to the various CPC_i(j)'s. Note, for k = 1, Lemma 2 gives the same results as obtained by (2). □

Example. Consider figure 4 and table 1 where a composite path CP_i(2) = (1,2,5,6) has CPC_i(j)'s as {(1,2), (5,6), (1,5), (1,5), (2,6), (5,6), (2,6)}. The weight of the CP_i(2) is, thus, 11 units.
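
Because the link capacities of figure 4 are not listed in the text, the sketch below applies (3) and (4) to the bridge network of figure 1 instead, whose capacities and cut set are quoted earlier. The function name is illustrative.

```python
# Bridge network of figure 1: link capacities and minimal cuts from the text.
capacity = {1: 10, 2: 4, 3: 5, 4: 3, 5: 4}
cut_set = [(1, 2), (1, 3, 5), (2, 3, 4), (4, 5)]


def composite_path_weight(cp, cut_set, capacity):
    """Lemma 2: minimum weight over the composite path cuts CPC_i(j)."""
    cp = set(cp)
    weights = []
    for cut in cut_set:
        cpc = cp & set(cut)  # equation (3): CPC_i(j) = CP_i(k) ∩ C_j
        weights.append(sum(capacity[e] for e in cpc))
    return min(weights)      # equation (4)


# 2-composite path (1,2,4,5): CPCs (1,2), (1,5), (2,4), (4,5) weigh 14, 14, 7, 7.
print(composite_path_weight((1, 2, 4, 5), cut_set, capacity))  # 7
# For k = 1 the result matches equation (2): min{w_1, w_4} = 3.
print(composite_path_weight((1, 4), cut_set, capacity))        # 3
```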

TABLE 1

Pathset and Cutset for Figure 4

Pathset                  Cutset
P_1 = (1,6)              C_1 = (1,2)
P_2 = (2,5)              C_2 = (5,6,8)
P_3 = (1,7,8)            C_3 = (1,3,5,8)
P_4 = (1,3,5)            C_4 = (1,3,4,5)
P_5 = (2,4,8)            C_5 = (2,3,6,7)
P_6 = (2,3,6)            C_6 = (4,5,6,7)
P_7 = (1,4,5,7)          C_7 = (2,3,4,6,8)
P_8 = (2,3,7,8)
P_9 = (1,3,4,8)


Definition: A composite path CP_i(k) is a success state of the network if it satisfies the capacity or flow constraint:

$W(CP_i(k)) \geq W_{min}$    (5)

Otherwise, the CP_i(k) is a failure state. Moreover, a CP_i(k) is defined as a redundant state of the network if there is at least one success state CP_j(u) such that CP_j(u) ⊆ CP_i(k). □

Definition: A cross_link_i(k), defined for the composite path, is the set of links common to the k simple paths forming the CP_i(k). □

Definition: The weight of a cross_link_i(k), W(cross_link_i(k)), is computed following (2). □

The notion of a cross_link and its weight are used to detect a failure k-composite path a priori.

Theorem 1. CP_i(k) is a failure state if W(cross_link_i(k)) < W_min, for cross_link_i(k) ≠ φ. □

Proof. Since every element in cross_link_i(k) is a link in all the k paths that form the CP_i(k), the flow in CP_i(k) is limited to the weight of the cross_link_i(k). Q.E.D.
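
The cross_link test can be sketched as follows. The capacities are those of the bridge network in figure 1, the threshold W_min = 6 is an assumed value for illustration, and the function names are hypothetical. A `False` result means only that Theorem 1 does not decide the case.

```python
# Link capacities of the bridge network in figure 1 (quoted in the text).
capacity = {1: 10, 2: 4, 3: 5, 4: 3, 5: 4}


def cross_link(paths):
    """Links common to all the simple paths forming a composite path."""
    common = set(paths[0])
    for path in paths[1:]:
        common &= set(path)
    return common


def fails_by_cross_link(paths, capacity, w_min):
    """Theorem 1: a non-empty cross_link lighter than w_min means failure."""
    common = cross_link(paths)
    if not common:
        return False  # no common link, Theorem 1 does not apply
    return min(capacity[e] for e in common) < w_min


W_MIN = 6  # assumed minimum message capacity
print(fails_by_cross_link([(1, 4), (2, 3, 4)], capacity, W_MIN))  # True
print(fails_by_cross_link([(1, 4), (2, 5)], capacity, W_MIN))     # False
print(fails_by_cross_link([(1, 4), (1, 3, 5)], capacity, W_MIN))  # False
```

In the first call both paths share link 4 (capacity 3), so the composite path is rejected without computing any composite path cut.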

Thus, before generating the CPC_i(j) set for a CP_i(k), use Theorem 1 to see if the CP_i(k) is a failure state. Obtain the CPC_i(j)'s only when the cross_link concept fails to detect a failure state. This happens when W(cross_link_i(k)) ≥ W_min or the k simple paths have no links in common, i.e., cross_link_i(k) = φ. Initially, each cross_link_i(1) of CP_i(1) is given by the simple paths in F_set(1). Later, for each CP_i(k), update the cross_link_i(k) by taking the set theoretic intersection of the two cross_link's of the CP_i(k-1)'s that merge into the CP_i(k).

3.2 Development of the Technique

The proposed CPE algorithm starts from the failure paths and utilizes a divide and conquer strategy to reduce the complexity of the problem. In Section 2, we discussed success and failure states of a network from the aspect of simple paths. For a given W_min capacity, first, test all simple paths and partition them into functioning (S_set) and non-functioning (F_set(1)) groups. Let the number of elements in F_set(1) be β. The F_set(1) is then used to obtain k-composite paths, 1 < k ≤ β. Lemma 2 (and the definition following the lemma) is utilized to partition the CP_i(k)'s into the S_set and F_set(k) categories. Note, both the S_set and F_set(k) can have many redundant terms. The redundancy (duplication and absorption) is checked easily with the help of the Boolean identity A ∪ AB = A. However, if special care is taken in the enumeration procedure, many of the redundancies can be avoided in the first place. We have observed that the number of redundant composite paths is reduced when the concept of path composition is recursively applied only to the failure states of the network. Additionally, use a path_graph PG(V,E) to obtain higher order (k > 2) compositions.


Figure 2. 5-Node, 8-Link Network
[The link capacity is shown within ( ).]


Figure 3. Path-Graph PG(V,E)


Figure 3 illustrates a path_graph PG(V,E). Here, a node P_i belongs to F_set(1). An edge joining any two nodes P_u and P_v is defined by a 2-composite path in F_set(2). Note that P_u P_v P_w (P_u P_v P_w P_x) generates a 3(4)-composition. It is reflected by a 3(4)-node complete graph or clique [19] in PG(V,E). In general, a y-composite path is a y-clique in the path_graph. The problem of composite enumeration is, thus, similar to determining cliques in PG(V,E). Note, each state in F_set(y) is a clique. Thus, generating higher order cliques is straightforward. Two (y-1)-cliques may form a y-clique. Take the XOR set theoretic operation on the elements of the two lower order cliques. If the result is an edge in PG(V,E), the two (y-1)-cliques produce a y-clique. The elements of the y-clique are obtained by using the union set theoretic operation on the elements of the (y-1)-cliques. This process is further explained in Section 4.
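
The XOR-then-union step can be sketched as below. The path_graph used here is a made-up 4-node example, not the one in figure 3, and `merge_cliques` is a hypothetical helper; cliques are represented as sets of path indices.

```python
def merge_cliques(clique_a, clique_b, pg_edges):
    """Return the y-clique formed by two (y-1)-cliques, or None.

    The symmetric difference (XOR) of the two node sets must be a single
    pair of nodes that is itself an edge of the path_graph.
    """
    a, b = frozenset(clique_a), frozenset(clique_b)
    if len(a) != len(b):
        return None
    diff = a ^ b
    if len(diff) == 2 and diff in pg_edges:
        return a | b  # union of the two lower order cliques
    return None


# Hypothetical path_graph: nodes are path indices, edges are 2-cliques.
pg_edges = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4)]}

print(merge_cliques({1, 2}, {1, 3}, pg_edges))  # frozenset({1, 2, 3})
print(merge_cliques({1, 3}, {3, 4}, pg_edges))  # None, (1,4) is not an edge
```

Since the two input cliques already share all but one node each, the only adjacency left to verify is the single pair returned by the XOR, which is why one edge lookup is enough.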

3.3 Efficient Generation of 2-Composite Paths

Algorithm I in [1] generates β(β-1)/2 2-composite paths or links in PG(V,E). To further reduce the complexity of this algorithm, we use the concepts of key_cut, key_links, and path_groups defined as follows.

Definition: A minimal cut C_i is defined as a key_cut if: i) w_j < W_min for all j ∈ C_i, and ii) (w_j + w_l) ≥ W_min for at least one j, l pair; j, l ∈ C_i. □

Condition i is required, while condition ii ((w_j + w_l) ≥ W_min) is important as it helps identify the failure composite paths a priori. If condition ii fails ((w_j + w_l) < W_min for all j, l pairs), the 2-compositions become failure states for the network. All failing simple paths are, then, used to generate β(β-1)/2 failure 2-composite paths as discussed in subsection 3.2. To illustrate a key_cut, consider a minimal cut (4,5) for the bridge network in figure 1. Capacities of links 4 and 5 are individually less than W_min (= 6) while the capacity of the minimal cut (4,5) is greater than W_min. Thus, the cut (4,5) represents a key_cut.
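
The two conditions of the key_cut definition translate directly into a predicate. This is a minimal sketch using the bridge network capacities of figure 1 and W_min = 6 as in the illustration; `is_key_cut` is a hypothetical name.

```python
from itertools import combinations

# Link capacities of the bridge network in figure 1 (quoted in the text).
capacity = {1: 10, 2: 4, 3: 5, 4: 3, 5: 4}


def is_key_cut(cut, capacity, w_min):
    """Check conditions i and ii of the key_cut definition."""
    caps = [capacity[j] for j in cut]
    each_below = all(w < w_min for w in caps)              # condition i
    some_pair_enough = any(a + b >= w_min                  # condition ii
                           for a, b in combinations(caps, 2))
    return each_below and some_pair_enough


print(is_key_cut((4, 5), capacity, 6))  # True: 3 < 6, 4 < 6, 3 + 4 >= 6
print(is_key_cut((1, 2), capacity, 6))  # False: w_1 = 10 is not below 6
```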

Definition: The cardinality of a key_cut (denoted as |key_cut|) is the total number of elements in it. Select C_i as a key_cut if:

i. C_i is either a source-node or terminal-node cut. If not, C_i should have at least one link connected to the source or the terminal node, and

ii. |C_i| is minimum. If |C_i| is equal for two or more terms, take arbitrarily one C_i. □

Definition: The elements of a selected key_cut are termed as key_links. □

Definition: A path_group G(η) is the set of failure paths, each of which contains a key_link η. □

Without loss of generality, a failure path which contains two or more key_links is assigned to the path_group G(η_j), where η_j is the lexicographically smallest of those key_links. This allocation strategy is useful to minimize the number of path_groups generated.

Example. Consider all four simple paths (1,4), (2,5), (1,3,5), (2,3,4) in figure 1 as failure states. Grouping them with key_links 4 and 5 we have:

G(4) = {(1,4), (2,3,4)} and G(5) = {(2,5), (1,3,5)}
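
The grouping in this example can be reproduced with a short sketch. The function names are hypothetical; the second helper lists only the pairs drawn from different path_groups, which is an illustration of how grouping shrinks the set of candidate 2-composite paths.

```python
from itertools import combinations


def build_path_groups(failure_paths, key_links):
    """Assign each failure path to the group of its smallest key_link."""
    groups = {link: [] for link in sorted(key_links)}
    for path in failure_paths:
        shared = sorted(set(path) & set(key_links))
        if shared:
            groups[shared[0]].append(path)
    return groups


def cross_group_pairs(groups):
    """Candidate 2-composite paths: one path from each of two groups."""
    pairs = []
    for g1, g2 in combinations(sorted(groups), 2):
        pairs.extend((p, q) for p in groups[g1] for q in groups[g2])
    return pairs


failure_paths = [(1, 4), (2, 5), (1, 3, 5), (2, 3, 4)]
groups = build_path_groups(failure_paths, key_links=(4, 5))
pairs = cross_group_pairs(groups)
print(groups)      # {4: [(1, 4), (2, 3, 4)], 5: [(2, 5), (1, 3, 5)]}
print(len(pairs))  # 4 candidates instead of 4 * 3 / 2 = 6
```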

Theorem 2. A k-composite path CP_i(k) within a path_group G(η) is a failure state. □

Proof. From the definition of a path_group G(η), a cross_link_i(k) obtained from the CP_i(k) contains at least one key_link. Using Theorem 1 and the definition of a key_cut, the W(CP_i(k)) is less than W_min. Q.E.D.

Theorem 3. The upper bound on the number of 2-composite paths is:

$\sum_{i=1}^{\gamma-1} \sum_{j=i+1}^{\gamma} \beta_i \beta_j$

Here β_i and β_j are |G(i)| and |G(j)|, respectively. Moreover, Σ_i β_i = m (or β) and γ is the total number of groups. □

Proof. The number of 2-composite paths is

$\binom{\beta}{2} - \sum_{i=1}^{\gamma} \binom{\beta_i}{2}$

which, if solved, gives the result. Q.E.D.

Theorem 4. The number of k-composite paths is bounded by N_k/N_{k-1} ≤ (m-k+1)/k, where N_{k-1} represents the number of (k-1)-composite paths. □

Proof. For the upper bound, Theorem 4 is true considering N_k and N_{k-1} as the binomial coefficients $\binom{m}{k}$ and $\binom{m}{k-1}$. Theorem 3 shows that N_2 is less than $\binom{m}{2}$. Thus, N_3/N_2 < (m-2)/3.

Q.E.D.

Note, to generate k-composite paths (k > 2), we also have to consider all (k-1)-composite paths formed from the elements within a path_group. We call the set of such composite paths an F_set_dummy(k-1). Although all states in F_set_dummy(k-1) are failure states (refer to Theorem 2), they can generate success states when combined to obtain higher order states (cliques).


4. COMPOSITE-PATH ENUMERATION TECHNIQUE
4.1 CPE Algorithm

Utilizing the concepts in subsections 3.1 through 3.3, a CPE algorithm is developed as follows.

1. Read input files having
   • path set and cut set;
   • link capacities w_j, for j = 1, ..., l;
   • W_min.

2. Get W_max = min{W(C_i)};
   if (W_max < W_min) goto 5;

3a. Determine success/failure states from path set; create
    F_set(1) and S_set for failure and success paths,
    respectively;

3b. If (|F_set(1)| > 1) then
       find a key_cut;
    else goto 5;

3c. If (key_cut exists) then
    begin
       create path_groups;
       generate CP_i(2)'s using the path_groups;
       create F_set_dummy(2) formed by 2-composite
          paths within each path_group;
    end;
    else
    begin
       generate CP_i(2)'s directly from F_set(1);
       F_set_dummy(2) = { }
    end;

3d. Determine success/failure states for all CP_i(2)'s; /* use (4) */
    append non-redundant success CP_i(2)'s to S_set;
    append non-redundant failure CP_i(2)'s to F_set(2);

4. Initialize PG_set = F_set(2) ∪ F_set_dummy(2);
   for ((k = 3, ..., β) and (|F_set(k-1)| > 1)) do
   begin
      /* Use PG_set to test for a k-composite path
         following the method described in subsection 3.2 */
      generate non-redundant CP_i(k)'s from F_set(k-1);
      generate non-redundant CP_i(k)'s from combination
         of F_set(k-1) with F_set_dummy(k-1);

      /* for each CP_i(k) */
      if (W(cross_link_i(k)) < W_min) then
         success = FALSE;
      else
      begin
         obtain CPC_i(j) set for each CP_i(k);
         if (min{W(CPC_i(j))} ≥ W_min)
            success = TRUE; /* CP_i(k) is a success */
      end

      if (success) then
         append the success CP_i(k) to S_set;
      else
         append the failure CP_i(k) to F_set(k);
      create F_set_dummy(k) from F_set_dummy(k-1);
      F_set(k) = F_set(k) ∪ F_set_dummy(k);
   end;

5. end.

4.2  Proof  of  Correctness 

The CPE algorithm solves two main problems:

1. Generating a sufficient number of CP_i(k)'s, which should lead to all possible success composite paths in a network.

2. Evaluating the capacity of a composite path to check for success/failure states.

In what follows, we show that the CPE algorithm always gets a solution, if it exists.

• Correctness proof for solving problem 1.

Lemma 3. Let CP_i(k) and CP_j(e) be success states. If CP_i(k) ⊆ CP_j(e) then CP_j(e) is a redundant success state and, hence, need not be generated. □
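The redundancy test of Lemma 3 reduces to a subset check, which is a single AND and compare when states are stored as bit masks. The sketch below is illustrative only: it assumes a composite path is represented by the set of links it uses, and the helper names are hypothetical.

```python
def to_mask(links):
    """Pack a collection of 1-based link numbers into an integer bit mask."""
    mask = 0
    for link in links:
        mask |= 1 << (link - 1)
    return mask


def is_redundant(candidate, success_masks):
    """True if some known success state is contained in the candidate state."""
    return any(s & candidate == s for s in success_masks)


if __name__ == "__main__":
    known = [to_mask({1, 2, 5, 6})]
    print(is_redundant(to_mask({1, 2, 3, 5, 6}), known))  # superset of a success state
    print(is_redundant(to_mask({1, 2, 3, 5}), known))     # not a superset
```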

The CPE algorithm starts from failure simple paths. Following Lemma 3, we need not generate higher order composite paths from success simple paths. The CPE algorithm generates sufficient CP_i(2)'s to obtain all possible success CP_i(2)'s (refer to subsection 3.3). For generating higher order states, the CPE utilizes the path-graph PG(V,E) whose nodes and links are F_set(1) and F_set(2), respectively. The problem of enumerating 3-, 4-, ..., k-composite paths is conceptually equivalent to generating 3-, 4-, ..., k-cliques in the path-graph PG(V,E). Following Lemma 3, it is obvious that if key_cut does not exist, the CPE algorithm generates sufficient k-cliques since it always generates k-cliques from all combinations of failure (k-1)-cliques. When a key_cut exists, the CPE uses key_cut, key_links, and path_groups to recognize some failure composite paths a priori. Since the CPE maintains F_set_dummy(k) to keep those predicted failure k-composite paths and later utilizes it to form (k+1)-cliques with F_set(k), for this case, the CPE also produces sufficient k-cliques.

• Correctness proof for solving problem 2.

The algorithm uses (2) (a special case of Lemma 2) and Lemma 2 to evaluate the capacity of simple paths and CP_i(k)'s, respectively. Note, a k-composite path is a subgraph of the network G(V,E). Following the definition of CPC_i(j) (refer to subsection 3.1), it is obvious that the CPC_i(j) set represents the cutsets for the CP_i(k) induced graph. Thus, by using the max-flow min-cut theorem [18] on CPC_i(j), we obtain the max-flow or capacity of the subgraph CP_i(k).


4.3 Computational Time Complexity

It  is  shown  in  P]  that  network  reliability  is  an  NP-complete 
problem.  The  CRR  problem  is  a  network  reliability  problem 
and,  hence,  it  is  difficult  to  compute  its  complexity.  The  CPE 
technique is an algorithm whose performance depends on the number of CP_i(k)'s generated, which in turn relies on the topology, link capacities, and minimum message bandwidth (W_min) for the network. Note, Appendix A provides the subroutines that are used to implement the CPE technique, and shows that each subroutine is a polynomial time function. In the following, we give the complexity of the CPE algorithm using some notation defined below. For a given network and a certain state k, let g_k and h_k be the number of non-redundant success and failure CP_i(k)'s, respectively. Let r be the total number of redundant CP_i(k)'s enumerated in the network, and K be the highest k-composition generated, i.e., 1 ≤ k ≤ K. Note, in general, K is less than the total number of paths m. The total number of states generated for a network is, thus, computed as:

$$r = \sum_{k=2}^{K} r_k, \qquad g = \sum_{k=1}^{K} g_k, \qquad \text{and} \qquad h = \sum_{k=2}^{K} h_k.$$

It is obvious that g ≤ m.

For each step k, each CP_i(k) is first checked for its redundancy against all success composite-paths and CP_j(k)s generated so far. Note, using set theoretic operation, a CP_i(k) can be tested for its redundancy against any CP_j(k) in constant time. Thus, the time complexity to detect the r number of redundant CP_i(k)s is:

$$t_r = O\Big(mr + \sum_{k=2}^{K} h_k r_k\Big),$$

where r_k denotes the number of redundant states in step k. The capacity of a CP_i(k) is computed in O(n) (refer to Appendix A) and, thus, the capacity computation for all non-redundant states enumerated is given as t_c = O(mn + hn). Given t_r and t_c, the time complexity of the CPE algorithm is:

$$t_{CPE} = t_r + t_c = O\Big(mn + mr + \sum_{k=2}^{K} h_k r_k + hn\Big).$$

It is obvious that the bottleneck of the CPE technique lies in the number of failure k-composite paths h, which can reach up to O(2^m).

4.4 Illustrating Example

Figure 4. 5-Node, 8-Link Network
[The link capacity is shown within ( ).]

For figure 4, let the link capacities be:

link            1   2   3   4   5   6   7   8
link capacity   5   6  23  10  12   9  31  32

Various assumptions are given in [1]. It is also assumed that the (s,t) path set and cut set are given. Table 1 illustrates these sets of information. Using (1), we get W_max = 11. The weight of each path is checked against a given value of W_min (= 10). It is found that none of the nine paths forms the success state for the network/system. One way to form 2-compositions, CP_i(2)'s, is to generate all $\binom{9}{2}$ = 36 composite paths of size 2. Alternatively, we may define a key_cut from the cut set. In the present example, the cut C_1 = (1,2) qualifies to be the key_cut. Consider its elements 1 and 2 as the key_links, and partition the 9 failure paths into 2 groups as:

G(1) = {(1,6), (1,7,8), (1,3,5), (1,4,5,7), (1,3,4,8)};

G(2) = {(2,5), (2,4,8), (2,3,6), (2,3,7,8)}.

Here |G(1)| and |G(2)| are 5 and 4, respectively. Thus, we need to generate only (36 - 16) = 20 2-composite paths. They are:

CP_{1,4}(2), CP_{1,5}(2), CP_{1,7}(2), CP_{1,9}(2), CP_{2,4}(2), CP_{2,5}(2),
CP_{2,7}(2), CP_{2,9}(2), CP_{3,4}(2), CP_{3,5}(2), CP_{3,7}(2), CP_{3,9}(2),
CP_{4,6}(2), CP_{4,8}(2), CP_{5,6}(2), CP_{5,8}(2), CP_{6,7}(2), CP_{6,9}(2),
CP_{7,8}(2), CP_{8,9}(2).

Here, all CP_i(2)'s but for CP_{4,8}(2) are success states for the network and, hence, we obtain F_set(2) = {CP_{4,8}(2)}. Since |F_set(2)| = 1, we need not generate 3- or higher order composite paths. Substituting for various CP_i(2)'s from table 1, and deleting the redundancies, we obtain 8 composite paths:

{(1,2,5,6), (1,2,4,6,8), (1,2,5,7,8), (1,2,3,5), (1,2,4,5,7),
(1,2,4,7,8), (1,2,3,4,8), and (1,2,3,7,8)}.

Using CAREL [4], the CRR expression is:

$$
\begin{aligned}
CRR ={}& p_1 p_2 p_5 p_6 + p_1 p_2 p_3 p_5 (1-p_6) + p_1 p_2 p_3 p_7 p_8 (1-p_5) \\
&+ p_1 p_2 p_4 p_7 p_8 (1-p_3)(1-p_5 p_6) \\
&+ p_1 p_2 p_4 p_5 p_7 (1-p_3)(1-p_6)(1-p_8) \\
&+ p_1 p_2 p_5 p_7 p_8 (1-p_3)(1-p_4)(1-p_6) \\
&+ p_1 p_2 p_4 p_6 p_8 (1-p_5)(1-p_7) \\
&+ p_1 p_2 p_3 p_4 p_8 (1-p_5)(1-p_6)(1-p_7).
\end{aligned}
$$

Substituting known values for the link reliabilities p_i, we obtain a numerical value for CRR. For example, if p_i = 0.9 for all i, CRR = 0.799655.
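The numerical value can be cross-checked in two ways. The sketch below is illustrative only: it assumes the eight disjoint-product terms read as coded in `crr_sdp` (a reading of the printed expression that reproduces the stated value at p_i = 0.9), and it compares that sum with a brute-force enumeration over all 2^8 link states using the eight success composite paths listed above. The function names are hypothetical and this is not the CAREL procedure itself.

```python
from itertools import product

# Non-redundant success composite paths of the example, as sets of link numbers.
SUCCESS_SETS = [
    {1, 2, 5, 6}, {1, 2, 4, 6, 8}, {1, 2, 5, 7, 8}, {1, 2, 3, 5},
    {1, 2, 4, 5, 7}, {1, 2, 4, 7, 8}, {1, 2, 3, 4, 8}, {1, 2, 3, 7, 8},
]


def crr_sdp(p):
    """Sum of the eight disjoint-product terms; p maps link number -> reliability."""
    return (
        p[1] * p[2] * p[5] * p[6]
        + p[1] * p[2] * p[3] * p[5] * (1 - p[6])
        + p[1] * p[2] * p[3] * p[7] * p[8] * (1 - p[5])
        + p[1] * p[2] * p[4] * p[7] * p[8] * (1 - p[3]) * (1 - p[5] * p[6])
        + p[1] * p[2] * p[4] * p[5] * p[7] * (1 - p[3]) * (1 - p[6]) * (1 - p[8])
        + p[1] * p[2] * p[5] * p[7] * p[8] * (1 - p[3]) * (1 - p[4]) * (1 - p[6])
        + p[1] * p[2] * p[4] * p[6] * p[8] * (1 - p[5]) * (1 - p[7])
        + p[1] * p[2] * p[3] * p[4] * p[8] * (1 - p[5]) * (1 - p[6]) * (1 - p[7])
    )


def crr_brute_force(p):
    """Probability that at least one success set is fully UP, by state enumeration."""
    links = sorted(p)
    total = 0.0
    for state in product((0, 1), repeat=len(links)):
        up = {link for link, s in zip(links, state) if s}
        if any(success <= up for success in SUCCESS_SETS):
            prob = 1.0
            for link, s in zip(links, state):
                prob *= p[link] if s else 1.0 - p[link]
            total += prob
    return total


if __name__ == "__main__":
    p = {i: 0.9 for i in range(1, 9)}
    print(round(crr_sdp(p), 6), round(crr_brute_force(p), 6))
```

Agreement between the two functions for arbitrary link reliabilities indicates that the eight terms are mutually disjoint and together cover the union of the success states.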

5.  DISCUSSION 

The proposed CPE technique is simple, and is implemented in C on an Encore Multimax system. Tables 2a, 3a, and 4a illustrate the number of failure simple paths, success states, and reliability values (CRRs) for the networks shown in figures 4-6, respectively. Various W_min values are considered to generate these sets of information. To check for the correctness of the result, we have compared the success states obtained using our CPE algorithm with those obtained by [1, algorithms 1 & 2] and an exhaustive method. Note, we refer to [1, algorithms 1 & 2] as Alg1 and Alg2, respectively. For the exhaustive method, we generate all possible combinations of failure simple paths (which equal 2^m - m - 1 states in the worst case). Lemma 2 is used to check each of the states to determine success states, and the redundant terms, if any, are deleted. The results generated from our CPE algorithm exactly match those obtained from Alg1, Alg2, and the exhaustive method. On the Encore Multimax system, the CPU time for different entries in table 4a ranges from 0.1 to 0.3 seconds, while it took 27.6 seconds CPU time (more than 4 hours real time) using the exhaustive method (generating 2^25 - 26 states). Note, for all W_min, min{w_j} < W_min ≤ W_max, since for W_min smaller than or equal to min{w_j}, all the simple paths will always be success states.
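The worst-case size of the exhaustive search follows directly from the 2^m - m - 1 expression: all subsets of the m failure simple paths, minus the empty set and the m single paths. A minimal sketch (the function name is hypothetical) reproduces the counts that appear in the 'Exhaustive' column of tables 2b and 3b:

```python
def exhaustive_state_count(m):
    """Number of combinations of two or more of the m failure simple paths."""
    return 2 ** m - m - 1


if __name__ == "__main__":
    for m in (5, 9, 10, 13, 14):
        print(m, exhaustive_state_count(m))
```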


TABLE 2a
Results for Figure 4

No.   W_min   Failure Paths   Success   CRR†
1     6       5               4         0.897407
2     7       9               9         0.806806
3     8       9               9         0.806806
4     9       9               9         0.806806
5     10      9               8         0.799655
6     11      9               7         0.799064

TABLE 2b
Results for Figure 4 (Continued)

              CP_i(k)s Generated
No.   W_min   Exhaustive   Alg1   Alg2, CPE   'Saving' of CPE‡
1     6       26           36     36          [illegible]
2     7       502          73     20          0
3     8       502          73     20          0
4     9       502          73     20          0
5     10      502          73     20          1
6     11      502          73     21          3

†Link reliability = 0.9 for all cases.
‡The difference between CPC_i(j)s generated by Alg2 [1] and CPE.
TABLE 3a
Results for Figure 5

No.   W_min   Failure Paths   Success   CRR†
1     2       5               13        0.977923
2     3       10              16        0.974977
3     4       13              29        0.949844
4     5       14              26        0.857353
5     6       14              1?        0.815038
6     7       14              13        0.737874
7     8       14              8         0.586283
8     9       14              4         0.449795
9     10      14              1         0.313811

TABLE 3b
Results for Figure 5 (Continued)

              CP_i(k)s Generated
No.   W_min   Exhaustive   Alg1   Alg2, CPE   'Saving' of CPE‡
1     2       26           21     21          5
2     3       1013         88     88          7
3     4       8178         227    227         17
4     5       16369        733    733         32
5     6       16369        1166   731         53
6     7       16369        1501   1043        55
7     8       16369        1934   1436        53
8     9       16369        2139   949         123
9     10      16369        2160   1363        125

†Link reliability = 0.9 for all cases.
‡The difference between CPC_i(j)s generated by Alg2 [1] and CPE.


Tables 2b, 3b, and 4b show the number of composite paths generated to produce the non-redundant success paths for the networks in figures 4-6, respectively. Note, the total number of composite paths generated depends on the given value W_min and link capacities of the network as well as on the network topology and, hence, is difficult to determine. The tables compare the performances of our CPE algorithm with Alg1, Alg2, and the exhaustive method in terms of the total number of composite paths generated to obtain those success states of the networks. As anticipated, our CPE algorithm outperforms both the exhaustive method and Alg1 [1]. Notice that for a given value of W_min, the number of states generated by the CPE algorithm and both Alg1 and Alg2 are the same when key_cut does not exist in the network. Except for table 4b, the exhaustive method enumerates all possible composite paths. To speed up our experiments, for figure 6, the exhaustive method generates only up to all possible combinations of 6 simple paths (the network has at most 4-composite non-redundant success paths obtained by our CPE algorithm and both Alg1 and Alg2).

The last column of tables 2b, 3b, and 4b displays the 'savings' of the CPE method over Alg2 for various W_min. Since the total number of states generated in both methods is the same, we compare their performances based on the number of CPC_i(j) enumerations. Note, the CPC_i(j)s generated for a CP_i(k) involve all cuts of the network and, hence, are expensive, especially for networks with many cuts. The number of generated CPC_i(j)s using the CPE method is at most the same as for Alg2. Refer to the difference between the two total numbers as the 'saving' of the CPE method over Alg2. By avoiding generation of some CPC_i(j)s, the CPE algorithm speeds up the capacity evaluation process of failure CP_i(k)s. Thus, the improvement of CPE over Alg2 is obvious.

TABLE 4a
Results for Figure 6

No.   W_min   Failure Paths   Success       CRR†
1     7       24              45            0.973055
2     8       25              41            0.962361
3     9       25              41            0.962361
4     10      25              41            0.962361
5     11      25              [illegible]   0.934972
6     12      25              32            0.934972
7     13      25              15            0.845304
8     14      25              15            0.691823
9     15      25              15            0.691823
10    16      25              15            0.691823
11    17      25              11            0.648671
12    18      25              10            0.640535
13    19      25              11            0.626502

TABLE 4b
Results for Figure 6 (Continued)

              CP_i(k)s Generated
No.   W_min   Exhaustive   Alg1   Alg2, CPE   'Saving' of CPE‡
1     7       2^24 - 25    724    724         90
2     8       2^25 - 26    1055   341         135
3     9       2^25 - 26    1055   341         135
4     10      2^25 - 26    1055   341         135
5     11      2^25 - 26    1766   403         180
6     12      2^25 - 26    1766   493         180
7     13      2^25 - 26    3?31   1308        180
8     14      2^25 - 26    5568   2124        179
9     15      2^25 - 26    5568   2124        179
10    16      2^25 - 26    5568   2124        179
11    17      2^25 - 26    5653   2190        179
12    18      2^25 - 26    5656   2195        179
13    19      2^25 - 26    5671   2209        179

†Link reliability = 0.9 for all cases.
‡The difference between CPC_i(j)s generated by Alg2 [1] and CPE.

The reliability values (CRRs) are obtained using CAREL [4]. From these reliability values we observe that there is a tradeoff between allowing larger flow in the network (making W_min larger) and the reliability of the network to carry out that flow. And for certain values of W_min, we have a good bargain. For example, in table 4a, increasing the W_min from 7 to 12 units (71.4% increase) reduces the reliability of the network by only 0.038082 (3.9%). If we increase the W_min from 8 to 10 units, the reliability of the network is unchanged, which is favorable for the user of the network. On the other hand, for some cases, increasing the W_min by 1 unit may reduce the reliability of the network appreciably, which is of course unfavorable. For example, in table 4a, increasing W_min from 13 to 14 units (7.7%) reduces the reliability of the network by 0.153481 (18.2%). By utilizing the method presented in this paper, we can get the best W_min for certain network configurations. Thus, the technique provides a novel approach for comparing the reliability of various candidate topologies having heterogeneous link-capacities.

Figure 5. 7-Node, 15-Link Network
[The link capacity is shown within ( ).]

Figure 6. 7-Node, 12-Link Network
[The link capacity is shown within ( ).]
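The percentages quoted for the tradeoff can be reproduced from the table 4a entries. The following is only an arithmetic check; the helper name is hypothetical.

```python
def percent_change(old, new):
    """Magnitude of the relative change from old to new, in percent."""
    return abs(new - old) / old * 100.0


if __name__ == "__main__":
    # W_min raised from 7 to 12 units; CRR drops from 0.973055 to 0.934972.
    print(round(percent_change(7, 12), 1), round(percent_change(0.973055, 0.934972), 1))
    # W_min raised from 13 to 14 units; CRR drops from 0.845304 to 0.691823.
    print(round(percent_change(13, 14), 1), round(percent_change(0.845304, 0.691823), 1))
```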


APPENDIX  A 

CPE Implementation

Data representation markedly affects the efficiency of algorithms. Since the CPE algorithm uses a lot of set theoretic operations (such as union, intersection, XOR), the bit vector data representation [21] is the best for its implementation. The bit vector operation minimizes the memory used and helps the program run efficiently. A path (cut) state in a network with l links requires only ⌈l/w⌉ × w bits of memory, for a word size w. Considering w = 16, a path P_i = (2,4,6,12,17) is represented as:

     word 1              word 2
0000100000101010    0000000000000001
msb          lsb    msb          lsb

For an l-bit representation, decoding/encoding of data usually requires l tests of single bits in the worst case. Since in general not all the l bits are set, we might eventually have better performance, especially when most of the set bits are in the same words. The following subsections describe the details of the steps listed in the CPE algorithm (Section 4.1).
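The word layout described above can be sketched as follows. This is illustrative only: it assumes link j occupies bit (j - 1) mod w of word ⌊(j - 1)/w⌋, with the least significant bit holding the lowest-numbered link of each word, which matches the two-word pattern shown for the path (2,4,6,12,17). The function names are hypothetical.

```python
def encode(links, num_links, w=16):
    """Pack 1-based link numbers into ceil(num_links / w) words of w bits each."""
    n_words = -(-num_links // w)  # ceiling division
    words = [0] * n_words
    for link in links:
        index, bit = divmod(link - 1, w)
        words[index] |= 1 << bit
    return words


def decode(words, w=16):
    """Recover the sorted list of link numbers from the packed words."""
    links = []
    for index, word in enumerate(words):
        for bit in range(w):
            if word & (1 << bit):
                links.append(index * w + bit + 1)
    return links


if __name__ == "__main__":
    words = encode([2, 4, 6, 12, 17], num_links=17)
    print([format(word, "016b") for word in words])
    print(decode(words))
```

With this layout, union, intersection, and XOR of two states become word-wise `|`, `&`, and `^`.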

A.1 Read Files: Decode the input files into the bit vector representation. We want to associate a path with a certain name P_i (i = 1, ..., m) and arrange the whole paths in a path list. The capacity of the links can be kept in an array of integers, capacity[L_j], j = 1, ..., l.

A.2 W_max Computation: When W_min > W_max, the network always fails. The W_max is defined as min{W(C_i)} using (1). The worst-case time complexity involved is O(l × n), where n is the number of cuts.

A.3a Success/Failure determination of paths: It is done as follows:

for (all paths P_i) {
   if (path_capacity(P_i) ≥ W_min)
      Append P_i to S_set;
   else
      Append P_i to F_set(1);
}

The path_capacity( ) function will return the capacity of a path P_i. In (2), the capacity of a path is given as min(capacity(L_j)), where L_j's are UP links in that path. The time complexity for A.3a is O(l × m).
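Step A.3a can be sketched with the data of the example in subsection 4.4. The sketch below is illustrative only: the link capacities and the nine simple paths are taken from that example, while the function names and the use of tuples and dictionaries (rather than bit vectors) are assumptions made for readability.

```python
# Link capacities of figure 4 and the nine (s,t) simple paths of the example.
CAPACITY = {1: 5, 2: 6, 3: 23, 4: 10, 5: 12, 6: 9, 7: 31, 8: 32}
PATHS = [
    (1, 6), (1, 7, 8), (1, 3, 5), (1, 4, 5, 7), (1, 3, 4, 8),
    (2, 5), (2, 4, 8), (2, 3, 6), (2, 3, 7, 8),
]


def path_capacity(path, capacity=CAPACITY):
    """Capacity of a simple path: the smallest capacity among its links."""
    return min(capacity[link] for link in path)


def classify_paths(paths, w_min, capacity=CAPACITY):
    """Split simple paths into S_set (success) and F_set(1) (failure)."""
    s_set, f_set_1 = [], []
    for path in paths:
        if path_capacity(path, capacity) >= w_min:
            s_set.append(path)
        else:
            f_set_1.append(path)
    return s_set, f_set_1


if __name__ == "__main__":
    for w_min in (6, 10):
        s_set, f_set_1 = classify_paths(PATHS, w_min)
        print(w_min, len(s_set), len(f_set_1))
```

For W_min = 10 every simple path fails, as stated in the example; for W_min = 6 the split into 4 success and 5 failure paths agrees with the first row of table 2a.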

A.3b Getting key_cut: The key_cut concept reduces the number of states enumerated and, thus, reduces the running time of the CPE algorithm. Using the set theoretic operations, finding a key_cut is straightforward. The algorithm is:

if (key_cut(source node cut C_s) or key_cut(terminal node cut C_t))
   let the min(|C_s|, |C_t|) be the key_cut;
if (C_s and C_t are not key_cut) {
   for (all cuts C_i)
      select as a key_cut the min(|key_cut(C_i)|);
}

The function key_cut( ) determines a key_cut based on its definition in Section 3.3. The time complexity of the key_cut( ) function is O(l) and, hence, the time complexity to find a key_cut is O(l × n).

A.3c Path_group Generation: When a key_cut exists, we intend to reduce the number of states generated by creating path_groups based on the key_links. The time complexity involved is O(l × δ), where δ is the number of failure simple paths. It is done as follows:


IEEE TRANSACTIONS ON RELIABILITY


for (j = 1 to l) {
   if (j ∈ key_links) {
      for (all CP_i(1) in F_set(1)) {
         if (j ∈ CP_i(1))
            let CP_i(1) be a member of group G(j);
      }
   }
}
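The grouping loop above, followed by the generation of 2-composite paths across groups, can be sketched as below. It is illustrative only: the failure paths and key_links are those of the example in subsection 4.4, the function names are hypothetical, and a composite path is represented simply as a pair of simple paths.

```python
from itertools import combinations, product

FAILURE_PATHS = [
    (1, 6), (1, 7, 8), (1, 3, 5), (1, 4, 5, 7), (1, 3, 4, 8),
    (2, 5), (2, 4, 8), (2, 3, 6), (2, 3, 7, 8),
]


def make_path_groups(failure_paths, key_links):
    """Group G(j) collects every failure simple path that uses key link j."""
    groups = {j: [] for j in key_links}
    for j in key_links:
        for path in failure_paths:
            if j in path:
                groups[j].append(path)
    return groups


def cross_group_pairs(groups):
    """2-composite candidates: one path from each of two different groups."""
    pairs = []
    for a, b in combinations(sorted(groups), 2):
        pairs.extend(product(groups[a], groups[b]))
    return pairs


if __name__ == "__main__":
    groups = make_path_groups(FAILURE_PATHS, key_links=(1, 2))
    print({j: len(g) for j, g in groups.items()})
    print(len(cross_group_pairs(groups)))
```

In this example no path uses both key links, so the groups are disjoint; in a network where a path may appear in more than one group, the pair list would need a redundancy check.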

A.3d 2-Composite Path Generation: The generation of 2-composite paths directly from F_set(1) is straightforward. We exhaustively create the combinations of two failure states. This method is easy, but generates many redundant states. Use of this method is suggested when the network does not have a key_cut. Nonetheless, a conservative technique to generate the 2-composite paths from the path_groups is employed. For a source or terminal node cut working as a key_cut, each member of a certain path_group is a unique path, and each 2-composite path created is also unique. Hence, we do not have to check for redundant states. On the other hand, the members of the path_groups generated from the key_links of a key_cut other than the above are not unique. In such cases, it happens that a generated 2-composite path is composed of paths which are members of a single group and is, therefore, always a failure state. To help reduce the computation time, we need to check for such cases. The time complexity of this step is O(δ²) if key_cut does not exist.

A.4 k-composite paths generation (k > 2): Here, we generate a k-clique from 2 (k-1)-cliques of F_set(k-1) using the set theoretic XOR operation. If the result is a link in PG(V,E), they have formed a k-clique. Otherwise, the 2 (k-1)-cliques fail to produce a k-clique. If the 2 (k-1)-cliques create a k-clique, get the k-clique by taking the union of those 2 (k-1)-cliques. Note, we may keep the F_set(2) in lexicographic order to speed up the searching process.
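The XOR test can be sketched on bit masks over path indices. The sketch below is illustrative only: it assumes each clique is stored as an integer whose bit (i - 1) is set when path P_i belongs to it, and that the links of the path-graph are kept as a set of two-bit masks. The function names are hypothetical.

```python
def mask(*path_indices):
    """Bit mask with bit (i - 1) set for every 1-based path index i."""
    m = 0
    for i in path_indices:
        m |= 1 << (i - 1)
    return m


def try_merge(clique_a, clique_b, pg_links):
    """Return the k-clique formed by two (k-1)-cliques, or None if they form none.

    The XOR keeps the vertices that belong to exactly one of the two cliques.
    If exactly two vertices remain and they are joined by a link of the
    path-graph, the union of the two cliques is a k-clique.
    """
    differing = clique_a ^ clique_b
    if bin(differing).count("1") == 2 and differing in pg_links:
        return clique_a | clique_b
    return None


if __name__ == "__main__":
    pg_links = {mask(1, 2), mask(1, 3), mask(2, 3), mask(3, 4)}
    print(try_merge(mask(1, 2), mask(1, 3), pg_links) == mask(1, 2, 3))
    print(try_merge(mask(1, 2), mask(3, 4), pg_links))
```

Keeping `pg_links` in a hash set plays the role of the ordered search mentioned above; a sorted list with binary search would serve the same purpose.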

• Success/Failure Determination (for CP_i(k)): Lemma 2 is incorporated to determine the success/failure of a CP_i(k). The algorithm is:

for (all cuts C_j) {
   if (W(CPC_i(j)) < W_min) {   /* (1) */
      append it to F_set(k);   /* check for redundancy first */
      break;
   }
}
if (CP_i(k) is not a failure)
   append it to S_set;   /* check for redundancy first */

For the worst case, the method uses all cuts to help determine a success CP_i(k). Thus, the time complexity for each state CP_i(k) is O(n).

• Cross_link_i(k) Generation: For each CP_i(k), cross_link_i(k) is generated by taking the intersection of cross_link_f(k-1) and cross_link_g(k-1) of CP_f(k-1) and CP_g(k-1), respectively. Thus, the time complexity for each cross_link_i(k) is constant.

APPENDIX  B 

B.1 A counter example for method [10]

Consider figure 2, where the capacity of each link is the number shown in parentheses. Nine simple paths for the network are: P_1 = (2,5,6), P_2 = (3,7,8), P_3 = (1,4,6,8), P_4 = (5,7), P_5 = (1,2), P_6 = (4,5,8), P_7 = (3,4), P_8 = (1,6,7), and P_9 = (2,3,6,8). All the paths represent failure states for W_min = 5. Now, generate $\binom{9}{2}$ = 36 2-composite paths from the failure simple paths. While checking for the successes, one obtains CP_{2,5}(2) as the only success 2-composite path. Since P_2 and P_5 have yielded a success composite path, [10] eliminates them from the simple path list. The seven remaining simple paths, P_1, P_3, P_4, P_6, P_7, P_8, and P_9, are then utilized to generate $\binom{7}{3}$ = 35 3-composite paths. Since P_5 and P_2 are no longer in the simple path list, method [10] fails to generate a non-redundant 3-composite path CP_{1,5,7}(3). Note, CP_{1,5,7}(3) is a success 3-composite path (refer to figure 2). Obviously, the method of [10] does not work in general.

B.2 A counter example for method [13]

For the following counter example, we use the notation in [13]. Consider figure 2 and its two typical paths: P_1 = (2,5,6) and P_2 = (3,7,8), with path capacities C_1 = 1 and C_2 = 2, respectively. Since there is no common link in the two paths, we use the modified step 2 of [13] to evaluate the capacity of the combination of paths P_1 and P_2 (C_12). Following [13], we define the vector l = [0 4 3 0 1 2 2 3], and initialize j = 1 and C_12 = 0.

for j = 1, x_1 = 1; and l = [0 3 3 0 0 1 2 3].
for j = 2, x_2 = 2; and l = [0 3 1 0 0 1 0 1].

Therefore C_12 = 1 + 2 = 3.

However, the definition of flow in [18] confirms that when all links in paths P_1 and P_2 are good, the links are capable of carrying 4 units of flow. Hence, the modified method in [13] is imperfect too. Note, the proposed method presented in subsection 3.1 gives the correct capacity of 4 units for C_12. It is observed that for two or more disjoint paths having at least one node (other than the source or terminal nodes) common amongst themselves, [13] is likely to lead to an incorrect conclusion.
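The trace in B.2 can be replayed mechanically. The sketch below is illustrative only: it reproduces the path-by-path subtraction exactly as printed above and makes no further claim about the procedure of [13]; the function name is hypothetical.

```python
def sequential_path_capacity(paths, link_capacity):
    """Push flow along each path in turn, subtracting it from the residual vector.

    link_capacity[j - 1] is the capacity of link j. Returns the accumulated
    flow and the final residual vector.
    """
    residual = list(link_capacity)
    total = 0
    for path in paths:
        x = min(residual[link - 1] for link in path)  # bottleneck of this path
        for link in path:
            residual[link - 1] -= x
        total += x
    return total, residual


if __name__ == "__main__":
    capacity = [0, 4, 3, 0, 1, 2, 2, 3]  # vector l of the example
    print(sequential_path_capacity([(2, 5, 6), (3, 7, 8)], capacity))
```

Because each path is treated as an isolated pipe, the residual capacity left on links 2, 3, 6, and 8 is never recombined through a shared intermediate node, which is why this procedure reports 3 units where the text states that 4 units can be carried.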
Abstract - In the sum of disjoint products (SDP) methods of network reliability analysis, the preprocessing of minimal paths/cuts is necessary to help reduce the number of SDP terms and, hence, the overall computation time. Several researchers have proposed 1) cardinality-, 2) lexicographic-, and 3) Hamming distance [4] ordering methods to preprocess the path terms in the SDP techniques. For cutsets, an ordering based on the node partition associated with each cut is suggested [3]. Our paper presents experimental results showing the number of disjoint products and computer time involved in generating SDP terms. To help obtain the results, we have considered 19 benchmark networks containing paths (cuts) varying from 4 (4) to 780 (7376). Several SDP techniques are reviewed and are generalized into three propositions to find their inherent merits and demerits. An efficient SDP technique is, then, utilized to run input files of paths/cuts preprocessed using 1) through 3) and their combinations. The experimental evaluation has been performed on an FPS 500 system. Finally, results are analyzed, and it is shown that the preprocessing based on cardinality or its combinations with 2) and/or 3) performs better.

I.  Introduction 

Several algorithms dealing with the terminal reliability evaluation are proposed in the literature. These methods can be classified as: state enumeration, decomposition technique, inclusion-exclusion, factoring, and sum of disjoint products. A summary of these techniques, including their relative merits and demerits, can be found in [1]. Note, all methods of terminal reliability computation are known to be NP-hard [2].

The sum of products technique (SPT) utilizes Boolean concepts to change a path or cut polynomial into an equivalent sum of disjoint products (SDP) expression and, thus, generates the reliability parameter of a given network. Algorithms based on this principle start with a Boolean polynomial formed by either the success terms (minimal paths) or the failure terms (minimal cuts) for a given network. To help reduce the generated SDP terms, the paths or cuts are sequenced according to the cardinality [5,15], and/or lexicographic ordering following the symbols of their alphabets [6]. Recently, Wilson [4] has suggested choosing the first term in a group lexicographically and, then, sorting the second and subsequent terms so that the number of variables a term has in common with the preceding term is maximized. Buzacott [3] proposes an alternative approach to order the minimal cuts based on the node partition associated with each cut. An SPT, then, converts this preprocessed Boolean polynomial of paths/cuts into an equivalent SDP form that represents the disjoint system logic [1-2,5,7-11,13,15-17]. The disjoint product for any (m-1) size path P_i, where m denotes the number of nodes in the graph model G(V,E), is obtained directly by intersecting the complements of the remaining (l-(m-1)) links of G(V,E) with P_i. This observation (first made in [3] and, then, proved in [10]) further reduces the computation time for algorithms based on the Boolean concepts. Note, an SDP expression has one to one correspondence with the system probability formula. A drawback of the algorithms based on the manipulation of Boolean sum of products or implicants is in the iterative application of certain operations and the fact that the Boolean function changes at every step and may be clumsy. Moreover, the Boolean function is simplified using absorption rules [18] and, thus, requires a considerable computational effort [8]. Therefore, most SPTs are applicable only to small to moderate sized networks. Recently, Soh and Rai [7] have proposed CAREL (Computer Aided RELiability evaluator), a new algorithm based on the SPT concept which can evaluate a large network (with 780 paths and 7376 cuts) in less than a minute CPU time (on an Encore MULTIMAX system) with modest memory requirement.

All preprocessing techniques reduce the number of SDP terms and, hence, the computation time of reliability analysis; however, there is no unified study or experimental work on their comparative performances. This paper provides experimental results to help compare their performances. For this, we have used 19 small to large benchmark networks with paths (cuts) from 4 (4) to 780 (7376). We consider 1) cardinality-, 2) lexicographic-, 3) Hamming distance- ordering methods and their combinations to preprocess paths/cuts for the benchmark networks. CAREL [7] is, then, utilized to obtain the SDP terms and the computation time for generating these terms. [In fact, any existing Boolean technique may be used to perform the experiment.] The experimental evaluation is performed on an FPS 500 system.

The layout of the paper is as follows: Section II presents preliminaries. It also discusses the bit vector data representation technique which has been used to program CAREL [7] and the sorting methods utilized in preprocessing. Section III considers a generalized view of SPTs and compares their basic philosophies. An example is solved to illustrate their concepts. Section IV discusses preprocessing methods and provides the experimental results using 19 benchmark networks. Note, the benchmark networks are a good representative of small to large graphs because they contain paths (cuts) varying from 4 (4) to 780 (7376). Finally, Section V outlines a discussion on the experimental results.

II. Preliminaries

2.1 Background

Consider a linear graph G(V,E) model for a network where V denotes the set of nodes and E represents the set of edges of the network. Note, |V| = m and |E| = l. Assume G(V,E) is free from self loops and directed cycles. Each edge has two states: good (UP) or bad (DOWN). Nodes are assumed to be perfect (imperfect nodes can be considered following a method given in [5]). Let the link failures be statistically independent. This assumption is useful to make the reliability problem mathematically tractable [1].

A minpath P_i is a path from a source node s to a
terminal node t in G(V,E). It is formed by the set of UP
edges such that no nodes are traversed more than once.
Pathset is defined as the set of minpaths. A cut is a
disconnecting set. All communication between a
prescribed (s,t) node pair is disrupted once the edges in
an (s,t) cut fail. We define a mincut as a cut which has no
proper subset that is also a cut, and cutset as the set of
mincuts. Assume that either the pathset or the cutset between
a source s and terminal t in G(V,E) is known.

2.2 Data Representation

Data structure is an important aspect of designing
efficient algorithms [12]. Rosenthal [14] has discussed the
advantages and disadvantages of three different kinds of
data representation. This section describes one of the
representations, namely bit vector representation, which
can be used efficiently in some of the Boolean SDP
techniques [7,10] and also for the lexicographic or Hamming
distance [18] ordering discussed later in the text.

A minpath/mincut in a network with l links is
represented by an identifier having l bits. An UP link of
the network is denoted by a binary 1. A binary 0 stands
for a don't care state (not a DOWN state). Consider the
minpaths ab, cd, ade and bce between the (s,t) node
pair in Figure 1. These minpaths are stored in memory
as the following identifiers (the leftmost bit is the most
significant bit):

    ab  : 1100000000000000
    cd  : 0011000000000000
    ade : 1001100000000000
    bce : 0110100000000000

In this example, we have utilized the word size w as 16
bits. Note, a minpath/mincut requires \(\lceil l/w \rceil\) words of
memory. With bit vector representation, the storage
requirement for a minpath (mincut) identifier depends on
the total number of links in the network and not on the
size of the pathset (cutset). Coding and decoding of path
information into the bit representation (identifier) and vice
versa may add extra cost, as it involves testing l bits.
However, this coding and decoding of minpaths/mincuts is a
one-time operation. It is usually worth the extra computation,
as the generation of disjoint cubes (and, hence, the
reliability) requires considerable manipulation. Moreover,
the ability of the bit representation to detect and eliminate
redundant terms using set theoretic operations like union,
intersection, subset, etc. is an important advantage. To
illustrate the concept for redundancy checking, assume the
reference term is X and the test term is XY (which is
redundant, since it is absorbed by X). Then do the following:

    reference (X)   1 1 0 0 1
    test (XY)       1 1 1 0 1
                    ---------   ; OR operation
                    1 1 1 0 1
    test (XY)       1 1 1 0 1
                    ---------   ; EOR operation
                    0 0 0 0 0

A result '00000' shows XY is redundant. Similarly, a
duplicate term is detected and deleted. Thus, the set
theoretic operations are easy to implement. Note, the
computation time is independent of the size of the
network. The number of links l, which affects the speed of
the set operations, increases the computation time by one
unit for every w additional links.
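
The bit vector scheme described above can be sketched in a few lines of Python. This is a minimal illustration and not the CAREL implementation: the names `encode`, `words_needed`, and `is_redundant` are hypothetical, and a single Python integer stands in for the \(\lceil l/w \rceil\) machine words of an identifier.

```python
def encode(term, links, width=16):
    """Return the identifier of `term`: bit k from the left is 1 when the
    k-th link of `links` is UP in the term, and 0 for don't care."""
    ident = 0
    for position, link in enumerate(links):
        if link in term:
            ident |= 1 << (width - 1 - position)
    return ident


def words_needed(num_links, word_size=16):
    """Number of w-bit words per identifier, i.e. ceil(l / w)."""
    return -(-num_links // word_size)


def is_redundant(reference, test):
    """OR the two identifiers, then EOR the result with the test term.
    Zero means the test term contains every UP link of the reference."""
    return ((reference | test) ^ test) == 0


LINKS = "abcde"  # links of the network in Figure 1
for path in ("ab", "cd", "ade", "bce"):
    print(f"{path:4s}: {encode(path, LINKS):016b}")

# Reference X = 11001 and test XY = 11101, as in the text.
print(is_redundant(0b11001, 0b11101))
```

The check costs one OR and one EOR per word, which is why the text counts one extra time unit for every w additional links.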

The bit vector representation can be used efficiently
for lexicographic sorting. Since the alphabets of our
symbols are mapped into the bit representation from most to
least significant bit, the lexicographic sorting can be achieved
simply by sorting the integer representation of the
paths/cuts in decreasing order. Note, the Hamming
distance [18] between two paths (cuts) is defined as the
number of bits in which their identifiers differ. We have
also used the bit vector representation to implement Hamming
distance ordering.
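
Both orderings reduce to integer operations on the identifiers, as the following sketch suggests. The function names are hypothetical, and Python's arbitrary-precision integers replace the word array of a real implementation.

```python
def encode(term, links):
    """Identifier of `term`, with the first link in the most significant bit."""
    ident = 0
    for link in links:
        ident = (ident << 1) | (1 if link in term else 0)
    return ident


def hamming_distance(ident_a, ident_b):
    """Number of bit positions in which two identifiers differ."""
    return bin(ident_a ^ ident_b).count("1")


def lexicographic_order(terms, links):
    """Sort terms by decreasing integer value of their identifiers."""
    return sorted(terms, key=lambda t: encode(t, links), reverse=True)


def hamming_order(terms, links):
    """Sort terms by increasing Hamming distance from the first term.
    Python's sort is stable, so ties keep their input order."""
    reference = encode(terms[0], links)
    return sorted(terms, key=lambda t: hamming_distance(encode(t, links), reference))


LINKS = "abcde"
print(lexicographic_order(["cd", "bce", "ab", "ade"], LINKS))
print(hamming_order(["ab", "cd", "ade", "bce"], LINKS))
```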
III. Sum of Disjoint Products Technique

3.1 Boolean Techniques Concept [7]

Boolean techniques of reliability evaluation start
with a sum of products expression for pathsets or cutsets
and convert it into an equivalent sum of disjoint products
(SDP) expression. In the SDP form, an UP or logical
success (DOWN or failure) state of a link x is replaced by
link reliability p (unreliability q), and the Boolean sum
(product) by the arithmetic sum (product). In other words,
the SDP expression is interpreted directly as an equivalent
probability expression of terminal reliability. If \(F_i\)
represents a path identifier (an UP state of a link in a path
\(P_i\) has 1 in \(F_i\), while a don't care is represented by 0),
the sum of products expression F is given by:


Table 1. Operators (OP) used with SDP techniques

    Operator function                    Ref.
    ---------------------------------    ----
    Cutset Disjoint Procedure            [10]
    $-operator                           [11]
    E-operator                           [5]
    COMPARE( ) function                  [15]
    CMB( · ) operator                    [9]
    Boolean negation                     [17]
    Relative complement, Procedure 1     [13]
    Method 1, CMB( * ) operator          [7]
    Method 2, CMB( * ) operator          [7]

\[ F = \bigcup_{i=1}^{n} F_i \qquad (1) \]

where n denotes the number of minpaths (mincuts)
between the (s,t) node pair in G(V,E). Note, the \(F_i\)'s in (1)
are preprocessed (ordered) following any of the methods in [3-6].
Equation (1) is modified either canonically or conservatively
to generate the equivalent SDP expression,
F(disjoint). The conservative modification is usually
preferred, since it is more efficient than the canonical
modification, where \(2^l\) events are required to determine
F(disjoint). [l is the number of links in the network.] A
simple way to generate the mutually disjoint events in (1)
is as follows:

\[ F_1 + F_2 \bar{F}_1 + F_3 \bar{F}_1 \bar{F}_2 + \cdots + F_n \bar{F}_1 \bar{F}_2 \cdots \bar{F}_{n-1} \]

where \(\bar{F}_i\) denotes the DOWN event of path \(P_i\). The
probability of UP (operational) for the ith term
\(F_i \bar{F}_1 \bar{F}_2 \cdots \bar{F}_{i-1}\) can be evaluated using
conditional probability and standard Boolean operations as:

\[ \Pr(F_i) \cdot \Pr(\bar{F}_1 \bar{F}_2 \cdots \bar{F}_{i-1} \mid F_i) = \Pr(F_i) \cdot \Pr\Bigl(\bigcap_{j=1}^{i-1} E_j\Bigr) \]

Here, an \(E_j\) represents a conditional cube [10] and
defines the conditions for a path identifier \(F_j\) DOWN given
\(F_i\) UP (operational). The probability of the first event,
\(\Pr(F_i)\), can be determined in a straightforward manner
since the failures are assumed to be statistically
independent. However, the second factor requires further
consideration since various terms within the \(E_j\)'s will, in
general, not be disjoint [1]. This necessitates that the \(E_j\)'s
be made mutually disjoint before we generate the equivalent
probability expression.
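
To make the disjoint expansion concrete, the sketch below evaluates the probability of each term \(F_i \bar{F}_1 \cdots \bar{F}_{i-1}\) for the four minpaths of Figure 1 by brute-force enumeration of all \(2^l\) link states. This corresponds to the canonical view of the problem and is only meant as a check on small networks; it is not one of the SDP algorithms cited here. The function name and the uniform link reliability p = 0.9 (the value used for Table 3) are assumptions of the example.

```python
from itertools import product


def disjoint_term_probabilities(paths, links, p):
    """Pr(F_i UP and F_1 .. F_{i-1} DOWN) for each i, by enumerating link states."""
    probs = [0.0] * len(paths)
    for state in product((True, False), repeat=len(links)):
        up_links = {link for link, is_up in zip(links, state) if is_up}
        weight = 1.0
        for is_up in state:
            weight *= p if is_up else 1.0 - p
        # The first path that is fully UP claims this state, so no state is
        # counted twice and the resulting events are mutually disjoint.
        for i, path in enumerate(paths):
            if set(path) <= up_links:
                probs[i] += weight
                break
    return probs


BRIDGE_PATHS = ["ab", "cd", "ade", "bce"]
terms = disjoint_term_probabilities(BRIDGE_PATHS, "abcde", p=0.9)
reliability = sum(terms)
print([round(t, 6) for t in terms], round(reliability, 6))
```

With p = 0.9 the four term probabilities add up to 0.978480, the terminal reliability listed for the bridge network in Table 3.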

Various researchers [1] have worked on this philosophy
and have proposed methods to generate a disjoint
expression for \((F_i, F_j)\) pairs in (1), and also for the
\((E_i, E_j)\) terms within an \(F_i\). The following three
propositions, P I through P III, which convert F into
F(disjoint), represent the basic principles behind most
Boolean SDP methods in the literature:

Proposition P I. The proposition P I defines intermediate
term(s) \(T_i\)'s as:

\[ T_i = \bigcup_{j=1}^{i-1} F_j^{*} \Big|_{\text{each literal of } F_i \to 1} \qquad (2) \]

where \(F_1^{*} = F_1\) and \(F_i^{*} = F_i \; \mathrm{OP} \; T_i\). Here, \(F_i^{*}\) refers to
the equivalent disjoint product term(s) for \(F_i\). The
operation 'OP' is a necessary disjointing operator (Table 1
lists various operators). The F(disjoint) expression is,
then, given by:

\[ F(\text{disjoint}) = \bigcup_{i=1}^{n} F_i^{*} \qquad (3) \]

Algorithms [9] and method 1 in [7] follow proposition P I.

Proposition P II. For each term \(F_i\), \(1 < i \le n\), \(T_i\) is
defined to be the union of all predecessor terms
\(F_1, F_2, \ldots, F_{i-1}\), in which any literal that is present in both
\(F_i\) and any of the predecessor terms is deleted from those
predecessor terms, i.e.,

\[ T_i = \bigcup_{j=1}^{i-1} F_j \Big|_{\text{each literal of } F_i \to 1} \qquad (4) \]

Consider \(F_1^{*} = F_1\), and define \(F_i^{*} = F_i \; \mathrm{OP} \; T_i\). Equation
(3), then, obtains the equivalent F(disjoint) expression.
Hariri and Raghavendra [10], Rai and Aggarwal [5],
Bennetts [13], and method 2 in Soh and Rai [7] have based
their techniques on proposition P II.
Proposition P III. For \(1 < i \le n\), use the operation 'OP' to
perform:

\[ F_i^{*} = (\cdots((F_i \; \mathrm{OP} \; F_1) \; \mathrm{OP} \; F_2) \; \mathrm{OP} \cdots) \; \mathrm{OP} \; F_{i-1} \qquad (5) \]

Equation (5) obtains a set of disjoint cubes [11]
corresponding to \(F_i\). Note, \(F_1^{*} = F_1\), and OP represents
an appropriate disjointing operator. The F(disjoint)
expression is, then, given by equation (3). Abraham [15],
Grnarov et al. [11], and Tiwari and Verma [17] have
proposed their methods using the P III concept.
Example. To illustrate propositions P I through P III,
consider the network shown in Figure 4. For the (s,t) node pair,
there exist 13 minpaths: adh, beh, bfi, aceh, acfi, adgi,
bcdh, begi, bfgh, acegi, acfgh, adefi, and bcdgi. In what
follows, we explain the steps to generate the exclusive and
mutually disjoint event(s) or cube(s) for \(F_4\) (= aceh). This
is demonstrated using typical methods for propositions P I
through P III. For uniformity, we keep the notation given
in [5], in which \(\bar{x}\) denotes the DOWN state of the link x.

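P I: Considering CAREL method 1 [7], \(F_1^{*} = adh\),
\(F_2^{*} = beh\,\overline{ad}\), and \(F_3^{*} = bfi(\bar{h} + h\bar{e}\,\overline{ad})\). Use equation (2) to
define \(T_4\) as \((d + b)\) for \(F_4\). Replacing OP with the CMB
operator [7], we get \(\mathrm{CMB}(T_4)\) as \(\bar{d}\bar{b}\). \(F_4^{*}\) is then obtained
as \(F_4^{*} = F_4\,\mathrm{CMB}(T_4)\). For \(j = 5, \ldots, 13\), the terms \(F_j^{*}\)
are generated similarly. An expression for F(disjoint) is:

\[
\begin{aligned}
F(\text{disjoint}) = {} & adh + beh\,\overline{ad} + bfi(\bar{h} + h\bar{e}\,\overline{ad}) \\
& + aceh(\bar{b}\bar{d}) + acfi(\bar{b}\bar{h} + \bar{b}h\bar{d}\bar{e}) \\
& + adgi(\bar{h}\bar{f} + \bar{h}f\bar{b}\bar{c}) + bcdh(\bar{a}\bar{e}\,\overline{fi}) \\
& + begi(\bar{h}\bar{f}\,\overline{ad}) + bfgh(\bar{e}\bar{i}\bar{d} + \bar{e}\bar{i}d\bar{a}\bar{c}) \\
& + acegi(\bar{h}\bar{f}\bar{d}\bar{b}) + acfgh(\bar{b}\bar{d}\bar{e}\bar{i}) \\
& + adefi(\bar{b}\bar{c}\bar{g}\bar{h}) + bcdgi(\bar{a}\bar{e}\bar{f}\bar{h}).
\end{aligned}
\]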

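P II: We have used the E-operator [5] to explain the
operation OP and, hence, the concept behind proposition P II.
The terms \(F_1^{*} = adh\), \(F_2^{*} = beh(\bar{a} + a\bar{d})\), and
\(F_3^{*} = bfi(\bar{h} + h\bar{a}\bar{e} + ha\bar{d}\bar{e})\) are computed using [5]. To
generate \(F_4^{*}\), an intermediate term \(T_4\) is obtained as:

\[ T_4 = (adh + beh + bfi)\Big|_{\text{each literal of } F_4 \to 1} = d + b + bfi, \]

where the last term is redundant and can be deleted. Substituting
OP with the E-operator, \(E(T_4)\) is \(\bar{b}\bar{d}\). Hence,
\(F_4^{*} = F_4\,E(T_4)\). Similarly, we obtain the \(F_j^{*}\)'s for
\(j = 5, \ldots, 13\). Equation (3) is:

\[
\begin{aligned}
F(\text{disjoint}) = {} & adh + beh(\bar{a} + a\bar{d}) + bfi(\bar{h} + h\bar{a}\bar{e} + ha\bar{d}\bar{e}) \\
& + aceh(\bar{b}\bar{d}) + acfi(\bar{b}\bar{h} + \bar{b}h\bar{d}\bar{e}) \\
& + adgi(\bar{h}\bar{f} + \bar{h}f\bar{b}\bar{c}) + bcdh(\bar{a}\bar{e}\bar{f} + \bar{a}\bar{e}f\bar{i}) \\
& + begi(\bar{h}\bar{f}\bar{a} + \bar{h}\bar{f}a\bar{d}) + bfgh(\bar{e}\bar{i}\bar{d} + \bar{e}\bar{i}d\bar{a}\bar{c}) \\
& + acegi(\bar{h}\bar{f}\bar{d}\bar{b}) + acfgh(\bar{b}\bar{d}\bar{e}\bar{i}) \\
& + adefi(\bar{b}\bar{c}\bar{g}\bar{h}) + bcdgi(\bar{a}\bar{e}\bar{f}\bar{h}).
\end{aligned}
\]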

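P III: Use reference [15] to obtain the terms \(F_1^{*} = adh\),
\(F_2^{*} = beh(\bar{a} + a\bar{d})\), and
\(F_3^{*} = bfi(\bar{a}\bar{e} + \bar{a}e\bar{h} + a\bar{d}\bar{e} + a\bar{d}e\bar{h} + ad\bar{h})\).
Here, the COMPARE function [15] substitutes OP. Then,

\[ F_4^{*} = (((F_4 \; \mathrm{OP} \; F_1) \; \mathrm{OP} \; F_2) \; \mathrm{OP} \; F_3). \]

The inner term \(F_4 \; \mathrm{OP} \; F_1\) gives \(aceh\,\bar{d}\), which with \(F_2\)
generates \(aceh\,\bar{d}\bar{b}\). Finally, the OP for the outer term
gives the result as \(aceh\,\bar{d}\bar{b}\). Similarly, compute \(F_j^{*}\) for
\(F_j\) (\(j = 5, \ldots, 13\)). Equation (3), then, gives:

\[
\begin{aligned}
F(\text{disjoint}) = {} & adh + beh(\bar{a} + a\bar{d}) + bfi(\bar{a}\bar{e} + \bar{a}e\bar{h} + a\bar{d}\bar{e} + a\bar{d}e\bar{h} + ad\bar{h}) \\
& + aceh(\bar{b}\bar{d}) + acfi(\bar{b}\bar{d}\bar{e} + \bar{b}\bar{d}e\bar{h} + \bar{b}d\bar{h}) \\
& + adgi(\bar{h}\bar{b}\bar{c} + \bar{h}\bar{b}c\bar{f} + \bar{h}b\bar{f}) \\
& + bcdh(\bar{a}\bar{e}\bar{f} + \bar{a}\bar{e}f\bar{i}) + begi(\bar{a}\bar{h}\bar{f} + a\bar{d}\bar{h}\bar{f}) \\
& + bfgh(\bar{a}\bar{c}\bar{e}\bar{i} + \bar{a}c\bar{d}\bar{e}\bar{i} + a\bar{d}\bar{e}\bar{i}) \\
& + acegi(\bar{h}\bar{f}\bar{d}\bar{b}) + acfgh(\bar{b}\bar{d}\bar{e}\bar{i}) \\
& + adefi(\bar{b}\bar{c}\bar{g}\bar{h}) + bcdgi(\bar{a}\bar{e}\bar{f}\bar{h}).
\end{aligned}
\]

Note, the F(disjoint) expressions obtained from different
propositions, when expanded out, should be identical. For the
network, the terminal reliability is 0.977184 when each
link is assumed to have a reliability of 0.9.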
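
The structure of proposition P II can be checked numerically with a short sketch. For each minpath it forms the intermediate term of equation (4) by deleting the literals of \(F_i\) from its predecessors, then multiplies \(\Pr(F_i)\) by the probability that none of the reduced terms is UP. That last probability is computed here by a plain Shannon expansion on one link at a time, which is a stand-in chosen for brevity and not the E-operator of [5]; the function names are hypothetical.

```python
def prob_no_term_up(terms, p):
    """Pr(no term in `terms` has all of its links UP), links independent."""
    if not terms:
        return 1.0
    if any(len(term) == 0 for term in terms):
        return 0.0  # an empty term is always UP
    pivot = min(terms[0])
    # Pivot DOWN: every term containing it can no longer be UP.
    without_pivot = [t for t in terms if pivot not in t]
    # Pivot UP: the pivot simply drops out of every term.
    pivot_removed = [t - {pivot} for t in terms]
    return ((1.0 - p) * prob_no_term_up(without_pivot, p)
            + p * prob_no_term_up(pivot_removed, p))


def terminal_reliability(paths, p):
    """Sum over i of Pr(F_i) * Pr(no reduced predecessor of F_i is UP)."""
    total = 0.0
    for i, path in enumerate(paths):
        f_i = frozenset(path)
        # Equation (4): predecessors with the literals of F_i deleted.
        t_i = [frozenset(prev) - f_i for prev in paths[:i]]
        total += p ** len(f_i) * prob_no_term_up(t_i, p)
    return total


MINPATHS = ["adh", "beh", "bfi", "aceh", "acfi", "adgi", "bcdh",
            "begi", "bfgh", "acegi", "acfgh", "adefi", "bcdgi"]
print(round(terminal_reliability(MINPATHS, 0.9), 6))
```

For \(F_4 = aceh\) the reduced predecessors are d, b and bfi, so the second factor is \(\Pr(\bar{b}\bar{d}) = 0.01\), in line with the worked example. Summed over all 13 minpaths, the result rounds to the 0.977184 quoted in the text.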

3.2 Existing SDP Techniques - A Comparison [7]

Propositions P I through P III maintain the minpaths
or mincuts list in memory (equation (1)). Consider
1 for an UP link and 0 for don't care, and utilize the bit
representation technique [discussed in Section 2.2]. The
memory requirement is, then, \(\lceil l/w \rceil\) words per path (cut),
where l is the number of links in the network G(V,E).

Proposition P I makes \(F_i\) disjoint with respect to
\(\bigcup_{j=1}^{i-1} F_j^{*}\), while propositions P II and P III utilize
\(\bigcup_{j=1}^{i-1} F_j\). Should we have similar operations to
implement equations (2) and (4), proposition P I will require
more operations than are needed for P II. Generally, an \(F_i\)
generates more than one SDP term \(F_i^{*}\). Hence, the number of
terms involved in \(\bigcup_{j=1}^{i-1} F_j^{*}\) is larger than that in
\(\bigcup_{j=1}^{i-1} F_j\). For example, Table 4 shows results for the
benchmark network with 780 minpaths, i.e., i = 780. The number
of terms in equation (2) is more than 50,000; on the other hand,
equation (4) needs exactly (i-1), i.e., 779 terms. Note, in
proposition P I, the generated SDP terms have to be kept
in memory to implement equation (2), which is not the
case for P II or P III. This makes proposition P I sequential.
Moreover, P I demands a huge memory space to
evaluate a large network. On the other hand, P II or P III
has implicit parallelism, making it easier for programmers
to implement them on parallel systems. Overall,
proposition P II or P III provides advantages in comparison
with P I.

An analysis of the performance comparison between a
typical example of proposition P II and P III is discussed
in [10]. SYREL [10], an implementation technique for the
E-operator [5], is shown to have better performance in
comparison with the $-operator [11]. It means proposition P II
outperforms P III. Moreover, proposition P II offers a
faster implementation approach than that in P I or P III.
The bit vector implementation of \(\bigcup_{j=1}^{i-1} F_j\) makes the
realization of equation (4) w (word size) times faster than
generating equation (2) based on proposition P I. In what
follows, we use CAREL [7] to perform our experiment.

Table 2. Preprocessing results for paths in Figure 4

    Random   H       L       C       C+H     C+L
    ------   -----   -----   -----   -----   -----
    adh      adh     acegi   bfi     adh     adh
    adgi     bcdh    aceh    beh     beh     beh
    adefi    acfgh   acfgh   adh     bfi     bfi
    aceh     aceh    acfi    bfgh    adgi    aceh
    acegi    adefi   adefi   begi    acfi    acfi
    acfgh    adgi    adgi    bcdh    begi    adgi
    acfi     bfgh    adh     acfi    aceh    bcdh
    bcdh     beh     bcdgi   aceh    bcdh    begi
    bcdgi    bcdgi   bcdh    adgi    bfgh    bfgh
    beh      acfi    begi    bcdgi   adefi   acegi
    begi     acegi   beh     acfgh   acegi   acfgh
    bfgh     bfi     bfgh    acegi   acfgh   adefi
    bfi      begi    bfi     adefi   bcdgi   bcdgi

    C = Cardinality ordering
    H = Hamming distance ordering
    L = Lexicographic ordering


IV. Preprocessing of Path/Cut Terms

Various researchers [1] have pointed out the need
for preprocessing (ordering) of paths/cuts to help reduce
the number of generated SDP terms and, hence, the
computation time. References [5,15] sequence the minimal
paths in the order of their increasing cardinality. For each
group of terms of the same size, Locks [6] further
suggests sorting the terms lexicographically, following the
order of the alphabets used to represent the paths/cuts.
Recently, Wilson [4] has modified the preprocessing of [6]
by incorporating a distance concept. The distance, here,
means the number of variables a term has in common
with a reference term, which is chosen within a group
lexicographically. Logically, we may call the distance a
Hamming distance [18] if we have a binary representation
for the two terms in question.

4.1 Preprocessing Methods

This section presents the concept of various
preprocessing methods which are used to order path/cut terms in
an SPT. We primarily utilize the following ordering
approaches to generate our experimental results:

1) increasing Hamming distance,

2) lexicographic,

3) increasing cardinality [5,15],

4) increasing cardinality and lexicographic [6], and

5) increasing cardinality and Hamming distance.

Note, approach 5) is similar to the one suggested in [4]
except for the fact that we pick the reference term in a
group randomly. Reference [4], on the other hand, chooses
a term based on lexicographic ordering.

Table 2 presents the minimal paths of the network
shown in Figure 4, and also the results of preprocessing
the paths based on the methods 1) through 5). Initially,
paths generated for the network are in random order, as
shown in column 1 of Table 2. For the Hamming distance
ordering, we arbitrarily pick the first term as the reference.
Then, we sort the paths based on the increasing Hamming
distance from the reference. The terms which have the
same Hamming distance can be in any order among
themselves. The result for this scenario is shown in column 2
of Table 2.

We use alphabets (1, 2, ..., l) to denote a path/cut of
a network, where l denotes the number of links in the
network. Label each edge with a distinct alphabet. Then,
sort the paths lexicographically [12]. The result of this
type of preprocessing for the paths in Figure 4 is
presented in column 3 of Table 2. The ordering using
cardinality is mentioned in [5,15]. Note, the terms with the
same cardinality are ordered in any sequence (refer to
column 4 of Table 2).

Column 5 of Table 2 shows the result of
preprocessing the paths of Figure 4 using the combination of
cardinality- and Hamming distance-orderings. For this
approach, terms of the same size are sorted based on
Hamming distance from the first term in the group. Note,
the reference (or the first term) in a group is chosen
randomly. This differentiates our method from [4], where the
first term is chosen by lexicographic ordering. Our
ordering method is computationally simpler than that in [4] and
it does not hinder the SPT concept. [The preprocessing time
should be less than 5-10 percent of the overall reliability
computation time.]

Finally, we consider preprocessing the paths based
on both the cardinality- and the lexicographic-ordering
[6]. We first order the terms by increasing cardinality.
Then, for the terms within the same cardinality
group, we use lexicographic sorting. The result of this
preprocessing is provided in the last column of Table 2.
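
The five orderings can be sketched compactly on top of the bit vector identifiers. The function names are hypothetical, the links of Figure 4 are assumed to be labelled a through i, and the reference term of each group is taken to be the first term in input order, where the text allows any (random) choice.

```python
LINKS = "abcdefghi"


def ident(term, links=LINKS):
    """Bit vector identifier with the first link in the most significant bit."""
    value = 0
    for link in links:
        value = (value << 1) | (1 if link in term else 0)
    return value


def hamming(term, reference):
    return bin(ident(term) ^ ident(reference)).count("1")


def by_hamming(terms):                       # approach 1
    return sorted(terms, key=lambda t: hamming(t, terms[0]))


def by_lexicographic(terms):                 # approach 2
    return sorted(terms, key=ident, reverse=True)


def by_cardinality(terms):                   # approach 3
    return sorted(terms, key=len)


def by_cardinality_lexicographic(terms):     # approach 4
    return sorted(terms, key=lambda t: (len(t), -ident(t)))


def by_cardinality_hamming(terms):           # approach 5
    ordered = []
    for size in sorted({len(t) for t in terms}):
        group = [t for t in terms if len(t) == size]
        # Reference: first term of the group in input order (an assumption).
        ordered += sorted(group, key=lambda t: hamming(t, group[0]))
    return ordered


# Column 1 (random order) of Table 2.
RANDOM_ORDER = ["adh", "adgi", "adefi", "aceh", "acegi", "acfgh", "acfi",
                "bcdh", "bcdgi", "beh", "begi", "bfgh", "bfi"]
print(by_cardinality_lexicographic(RANDOM_ORDER))
print(by_cardinality_hamming(RANDOM_ORDER))
```

Because Python's sort is stable, terms that tie on the sort key keep their input order, which is one admissible choice where the text leaves the order among equal terms open.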

4.2 Experimental Results

Once we have preprocessed the paths/cuts of a
network using approaches 1) through 5) mentioned in
subsection 4.1, we generate the SDP terms (and the terminal
reliability for the network) using an SPT, CAREL [7]. Table 3
lists the 19 benchmark networks that we have utilized for
our experiments. The table also gives the reliability
(unreliability) values assuming that each network has a link
reliability (unreliability) of 0.9 (0.1). We denote each of
the benchmarks with the notation \(N_j^i\), where the subscript j
(superscript i) represents the total number of minpaths
(mincuts) for the network. Note, j (i) varies from 4 (4) to
780 (7376). We generate the reliability values of the
network from the pathsets, and the unreliability results from
the cutsets. As expected, the sum of the reliability and
unreliability values for a network is 1.

Table 3. Benchmark networks used in experiments

    No.  Network*  R**       Q**       Comments
    ---  --------  --------  --------  ----------------------------------
    1    ...       0.978480  0.021520  Fig. 1   Bridge network
    2    ...       0.968425  0.031575  Fig. 2   6-node, 8-link network
    3    ...       0.997632  0.002368  Fig. 3   5-node, 8-link network
    4    ...       0.977184  0.022816  Fig. 4   Modified ARPANET
    5    ...       0.964855  0.035145  Fig. 5   ARPANET in 1971
    6    ...       0.996663  0.003336  Fig. 6   7-node, 13-link network
    7    ...       0.994076  0.005924  Fig. 7   11-node, 21-link network
    8    ...       0.969112  0.030888  Fig. 8   9-node, 13-link network
    9    ...       0.975116  0.024884  Fig. 9   Modified ARPANET
    10   ...       0.984068  0.015932  Fig. 10  Fig. 9 with different (s,t)
    11   ...       0.997494  0.002506  Fig. 11  7-node, 12-link network
    12   ...       0.996217  0.003783  Fig. 12  8-node, 13-link network
    13   ...       0.997186  0.002814  Fig. 13  16-node, 30-link network
    14   ...       0.904577  0.095423  Fig. 14  ARPANET
    15   ...       0.974145  0.025855  Fig. 15  reduced form of Fig. 14
    16   ...       0.997506  0.002494  Fig. 16  10-node, 21-link network
    17   ...       0.985928  0.014072  Fig. 17  ARPANET
    18   ...       0.987390  0.012610  Fig. 18  reduced form of Fig. 17
    19   ...       0.997120  0.002880  Fig. 19  20-node, 30-link network

    *  \(N_j^i\) means a network with j paths and i cuts
    ** Terminal reliability (unreliability) values are for link reliability (unreliability) = 0.9 (0.1)

Table 4 presents the experimental results obtained by
running CAREL [7] with random and preprocessed lists of
pathsets for the benchmark networks. The table provides
the results in terms of the total number of disjoint paths and
the computer time involved in generating the SDP terms.
Note, the computer time is in CPU seconds, and '0'
seconds represents a time of less than 0.1 second. These
results are generated using an FPS 500 system.

Table 5 is obtained similarly. It shows the
experimental evaluations for random and preprocessed lists of
cutsets. To help evaluate the efficiency of a preprocessing
technique, we have used the number of disjoint cuts and the
computer time consumed to generate them for each
benchmark network.

V.  Discussion 

Tables 4 and 5 suggest that Hamming distance and
lexicographic preprocessings do not reduce the complexity
of SPT for network reliability evaluation. Both types of
preprocessing may or may not reduce the number of
disjoint path/cut terms and the CPU time compared to
those for the randomly ordered path/cut terms.
The basic idea in having a Hamming distance- or a
lexicographic-ordering is to take advantage of overlapping
link variables in the paths. But, for large networks,
since there are so many permutations of variables involved
in the ordering of the paths/cuts, it is likely that the
Hamming distance- and lexicographic-orderings take
advantage of the overlap only in the early stages of the
computation. For the larger part of the evaluation, we may or
may not have a better overlap of variables compared to
the random order of paths/cuts. This reasoning explains
our results (refer to Tables 4 and 5).

On the other hand, cardinality-based preprocessing
(either cardinality ordering only or its combinations with
Hamming distance- or lexicographic-ordering) shows
significant improvements over the other techniques. For
larger networks, Tables 4 and 5 show that both
the number of disjoint paths/cuts and the CPU time
required to compute them are dramatically reduced. The
results demonstrate the importance of cardinality-based
preprocessing (to help reduce the time complexity of SPT for
the network reliability problem). Note, for path-based SPT, the
cardinality-based preprocessings may also take further
advantage of the fact that the disjoint term for a path \(P_i\)
having (n-1) links is generated simply by intersecting the
complements of the remaining (l-(n-1)) links of the
network with \(P_i\) [5,10]. This fact significantly reduces the
CPU time of reliability evaluation for large networks. But
the method does not apply to non-cardinality based
preprocessing techniques. Moreover, this observation is
hardly applicable to any preprocessing method used on
cut terms.

While comparing the cardinality-based techniques,
we notice in Tables 4 and 5 that the number of disjoint
paths/cuts and the CPU times required from inputting the
preprocessed paths/cuts to CAREL [7] are of the same
order. Adding Hamming distance- or lexicographic-orderings
may or may not reduce the disjoint paths/cuts
generated or the CPU time consumed to generate them. For
large networks, the results can be easily explained from
the fact that neither lexicographic- nor Hamming distance-
orderings give any benefits over random ordering (as
explained earlier). Note, in large networks, in general,
there are a lot of terms which are of the same cardinality.
Thus, ordering the terms in each group based on
lexicographic or Hamming distance does not provide a significant
advantage over random ordering of the terms of the same
size.

The paper has presented the experimental results
for the performance comparison of five different
preprocessing approaches for paths/cuts. For paths, we may
definitely conclude that cardinality-based preprocessing
(cardinality only or its combinations with lexicographic-
and/or Hamming distance-orderings) is worth considering
with SPTs, as it reduces the computer time. For cut-based
SPTs, we could not get the results for the preprocessing
method proposed in [3]. We conjecture that the method of [3] is
quite involved and that its time complexity to preprocess the
cuts is magnitudes higher than those we have
considered in this paper. With this fact in view, we
consider the cardinality-based ordering schemes to be the best
bet even for cuts.

1. Introduction

To  achieve  the  faster  computing  speeds  imperative  for  many  computer  applications,  the  use 
of  multiple  processors  operating  in  parallel  is  a  necessity.  Consequently,  the  reliability  of  the 
network interconnecting these processors is of notable importance. For general networks, one
may define several reliability measures. Terminal reliability is the probability that a working path
exists  from  one  specified  node  to  another  specified  node;  K-terminal  reliability  is  the  probability 
that  a  working  path  exists  from  one  specified  node  to  each  of  a  specified  set  K  of  nodes;  broadcast 
reliability  is  a  special  case  of  K-terminal  reliability  in  which  K  is  the  set  of  all  output  nodes.  For 
general  networks,  the  problems  of  terminal,  broadcast,  and  K-terminal  reliability  evaluations  are 
computationally intractable, specifically, #P-complete. For some particular networks, though, the
network  offers  sufficient  structure  to  allow  efficient  evaluation  of  reliability. 

Multistage  interconnection  networks  (MINs)  are  a  widely  studied  means  of  interconnecting 
processors  to  memory  or  processors  to  processors  by  stages  of  switches.  Experimental  systems 
also  increasingly  use  MINs.  Such  large  scale  projects  as  IBM’s  RP3,  the  University  of  Illinois’ 
Cedar,  Purdue  University’s  PASM,  and  PUMPS  possess  MINs  as  an  integral  part  of  their  design. 
A MIN consists of N inputs and N outputs and (typically) n stages of switching elements (SEs),
where \(n = \log_2 N\). The most common type of MIN uses as SEs 2x2 crossbar switches, which are
able to produce either a straight (T-mode) or exchange (X-mode) connection.

Because  of  appealing  properties  such  as  node  and  edge  symmetry,  logarithmic  diameter, 
high  fault  resistance,  scalability,  and  the  ability  to  host  popular  interconnection  networks,  such  as 
ring,  torus,  tree,  and  linear  array,  hypercube  multiprocessors  have  been  the  focus  of  many 
researchers  over  the  past  few  years.  This  topology  has  resulted  in  several  experimental  and 
commercial  products,  typical  examples  being  the  Intel  iPSC,  NCUBE/10,  Caltech/JPL,  and  the 
Connection  Machine.  Conceptually,  the  hypercube  interconnection  network  is  a  multidimensional 
binary cube with a processor at each of its vertices. An n-dimensional hypercube comprises \(2^n\)
processors and \(n 2^{n-1}\) links. Each processor has its own local memory, and processors
communicate by explicit message passing either directly or through some intermediate processors.
A subcube is a substructure of a hypercube that preserves the properties of the hypercube.

In the work supported by Air Force Office of Scientific Research grant AFOSR-91-0025,
we  have  examined  reliability  evaluation  problems  for  MINs  and  hypercubes.  The  main  aims  of  the 
work  have  been  twofold — first,  to  obtain  efficient  reliability  evaluation  algorithms  for  MINs  and 
hypercubes,  and  second,  to  incorporate  dependent  and  multimode  failures  into  these  algorithms. 
The  method  underlying  our  research  has  been  to  employ  a  structural  approach  for  studying  the 
reliability  of  multiprocessing  systems  based  on  multistage  interconnection  networks  and 
hypercubes.  We  selected  this  approach  for  two  reasons.  First,  the  regular  structure  of  the 
topologies studied allows efficient solutions to certain difficult reliability measures. Second, the
approach  yields  a  framework  in  which  incorporating  dependent  and  multimode  failures  is 
straightforward.  For  our  work  this  second  year  of  the  project,  we  concentrated  on  new 
computational  procedures  for  calculating  reliability  measures  (exactly  and  approximately)  in  MIN 
and  hypercube  based  multiprocessing  architectures.  Sections  2  and  3  provide  the  results.  A 
bibliography  at  the  end  of  this  report  compiles  our  research  papers  on  the  topic.  Appendix  A 
summarizes results obtained in the first year (October 1, 1990 to September 30, 1991). We believe
that our efforts will help the AFOSR in the analysis, modelling, and simulation of the
connectivity, survivability, and availability of these multiprocessor systems (with or without
multimode component and dependent failure models).

In  addition  to  the  efforts  of  the  P.I.,  S.  Rai,  and  co-P.I.,  J.  Trahan,  the  work  was 
supported  by  a  doctoral  student,  S.  T.  Soh,  and  five  M.S.  students:  S.  Ananthakrishnan,  V. 
Narayan,  T.  Smailus,  N.  Venkataramani,  and  D.  Wang. 

2.  Multistage  Interconnection  Networks 

2.1.  RESULTS  -  An  Overview 

Our  aims  in  studying  MINs  were  to  develop  efficient  algorithms  for  the  evaluation  of 
terminal reliability (TR), broadcast reliability (BR), and K-terminal reliability (KR) of MINs.
These algorithms were to incorporate dependent and multimode failures. In the first year of the
project, we developed algorithms for TR, BR, and KR of the shuffle-exchange network with an
extra stage (SENE) [2, 9]. These algorithms were efficient, each running within O(log N) time for
an NxN SENE. They assumed two-mode and independent failures, however. This work was an
essential  first  step  in  the  development  of  more  general  algorithms,  as  the  simplifying  assumptions 
of  two-mode  and  independent  failures  allowed  us  to  concentrate  on  exploiting  the  structural 
properties  of  the  MIN  to  develop  reliability  evaluation  algorithms.  We  expected,  and  later  results 
have  borne  out  the  expectation,  that  these  structure-based  algorithms  would  provide  the  basis  for 
algorithms  under  more  realistic  assumptions.  Later  in  the  first  year,  we  extended  these  algorithms 
to  incorporate  multimode  switch  failures  [18].  The  new  algorithms  ran  to  within  a  constant  factor 
of  the  time  of  the  algorithms  assuming  two-mode  failures,  with  the  exception  of  the  TR  algorithm, 
which ran in O(log N) time as opposed to O(loglog N) time.

In the second year of the project, we enhanced the algorithms to incorporate dependent
failures [3, 6, 16]. We also examined the SENE and the merged delta network (MDN) under
conditions that allowed link failures [10, 17]. In what follows, we discuss the results in brief; the
details  can  be  obtained  from  the  enclosed  papers. 

2.2.  MULTIMODE  AND  DEPENDENT  FAILURES 

Assumptions  of  two-mode  and  independent  failures  are  common  in  reliability  evaluation, 
since  they  simplify  the  computation  and  since  the  evaluation  problems  remain  intractable  for 
general  networks  even  under  these  assumptions.  These  assumptions,  however,  fail  to  adequately 
model  real-world  situations.  For  example,  researchers  have  described  instances  of  dependent 
failures,  or  fault  side-effects,  in  the  PASM.  Moreover,  the  assumption  of  independent  failures 
leads  to  an  overestimate  of  reliability.  The  two-mode  model  leads  to  an  underestimate  of  reliability 
because  it  does  not  allow  a  degraded  operational  mode  of  SEs.  Note,  some  work  exists  on 
incorporating  multimode  components  or  dependent  failures  into  reliability  analysis  for 
telecommunication networks though little for specific multiprocessor networks.

We developed efficient algorithms for TR, BR, and KR evaluation of an SENE composed
of  identical  SEs,  allowing  multimode  and  dependent  failures.  The  SENE  that  we  examined 
contains  2x2  SEs.  In  the  two-mode  case,  an  SE  is  either  completely  working  or  completely  failed. 
In the multimode case, the algorithms assume a 4-mode model of an SE: a fully operational mode,
a completely failed mode, and two degraded operational modes, namely, stuck-at T mode and
stuck-at X mode. There exist 16 possible degraded modes for a 2x2 crossbar SE. One can readily
extend our approaches to incorporate any more or all of the 16 modes. We model stuck-at faults (0
or 1) and bridging faults between two adjacent links in terms of switch failures. To incorporate
multimode failures, thereby extending the earlier reliability evaluation algorithms [2, 9] to more
realistic assumptions, we modified the shock model of dependent failures, which is considered in
the literature for general networks. For an NxN SENE, the algorithms run in time O(log N),
O(log N), and O(k log N), respectively, where k = |K|.

The  shock  model  assumes  that  statistically  independent  shocks,  which  occur  with  known 
probability,  cause  the  failure  of  network  components.  When  a  shock  occurs,  it  causes  the  failure 
of  a  specific  component  or  set  of  components.  We  say  that  the  shock  affects  the  component  or  set 
of  components.  A  shock  affecting  a  single  component  is  called  an  internal  shock  (IS),  while  a 
shock  affecting  multiple  components  is  called  an  external  shock  (ES).  The  main  motivation  for  the 
shock model is that there exist events that may cause one or more components (links or nodes) in a
network  to  fail  simultaneously.  Explicitly,  for  example,  some  components  may  share  important 
equipment  in  common  such  as  a  power  supply,  or  several  components  may  reside  in  a  common 
chip.  Components  will  fail  simultaneously  if  power  fails  or  the  chip  burns.  Test-maintenance- 
operation  errors,  design  defects,  and  electromagnetic  interference  may  also  lead  to  simultaneous 
failures  of  components.  Some  shocks  may  not  induce  failures  of  components  but  may  degrade 
them  so  that  their  joint  probability  of  simultaneous  failure  increases. 
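
The effect of an external shock on a redundant pair can be seen in a small numerical sketch. The Python below is illustrative only: the component names, the shock probabilities, and the function names are assumptions chosen for the example, and a component is taken to work exactly when no shock affecting it occurs.

```python
def survival_probability(components, shocks):
    """P(every component in `components` works).

    `shocks` is a list of (probability, affected_set) pairs; shocks occur
    independently, and a shock that occurs fails every component it affects.
    """
    prob = 1.0
    for p, affected in shocks:
        if affected & components:      # this shock must not occur
            prob *= 1.0 - p
    return prob


def parallel_pair(shocks, a="A", b="B"):
    """Reliability of a redundant pair (at least one of a, b works)."""
    p_a = survival_probability({a}, shocks)
    p_b = survival_probability({b}, shocks)
    p_ab = survival_probability({a, b}, shocks)
    with_dependence = p_a + p_b - p_ab          # inclusion-exclusion
    assuming_independence = 1.0 - (1.0 - p_a) * (1.0 - p_b)
    return with_dependence, assuming_independence


# Hypothetical numbers: two internal shocks (IS) and one external shock (ES).
SHOCKS = [(0.01, {"A"}), (0.01, {"B"}), (0.05, {"A", "B"})]

if __name__ == "__main__":
    print(parallel_pair(SHOCKS))
```

With these numbers the pair is less reliable than the independence assumption suggests, because the external shock removes both components at once.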

The  shock  model  was  defined  for  two-mode  components.  By  expanding  the  notion  of  an 
IS, we generalize its application to an SENE in which SEs may operate in degraded modes.
Instead of associating a single IS with a single component, we associated one IS with each failed
or degraded working mode of a component. We model a shock that causes a stuck-at fault in a
control line by a shock that forces the SE into stuck-at T or stuck-at X mode.

For  external  shocks,  we  restrict  our  analysis  to  two  modes  only,  so  the  occurrence  of  an 
ES causes all affected SEs to fail. SEs that are far apart are unlikely to be affected by a single ES;
SEs  that  are  adjacent,  however,  are  likely  to  be  affected  by  one  shock.  In  general,  the  classes  of 
ESs  that  we  define  are  motivated  by  the  failure  of  one  SE  causing  the  failure  of  other  SEs  due  to 
the  links  connecting  them.  For  shared-memory  computers,  each  processor  reads  from  or  writes  to 
the  shared-memory  through  a  MIN,  so  communication  flows  in  both  directions.  When  the 
forward  (reverse)  part  of  an  SE  is  failed,  this  SE  may  send  erroneous  routing  and  control 
information  to  either  or  both  SEs  in  the  next  (previous)  stage  that  are  connected  to  it  and  may  cause 
one  or  both  to  fail.  Hence,  we  assume  that  an  external  shock  causes  a  failure  in  adjacent  SEs  either 
in  the  forward  direction  (toward  the  network  outputs)  or  in  the  reverse  direction  (toward  the 
network  inputs).  In  a  practical  design,  a  failure  of  an  output  (input)  controller  can  create  a  forward 
(reverse)  external  shock.  Researchers  have  described  such  dependent  failures  in  the  PASM. 

To help compute the reliabilities, we define four classes of shocks. Each class of shock
affects a certain structured set of SEs. A Class 1 shock causes an SE to be completely
failed, stuck-at T, or stuck-at X. A Class 2 shock causes the failure of an SE and of one SE to
which it is connected in the previous or next stage. A Class 3out shock causes the failure of an
SE and of the two SEs in the next stage to which it is connected by its output links. A
Class 3in shock causes the failure of an SE and of the two SEs in the previous stage to which it is
connected by its input links. We define such shocks for every SE. Because the structures affected
by  the  same  class  of  shocks  are  the  same  and  all  SEs  are  identical,  the  probabilities  that  the  same 
class  of  shock  occurs  are  identical  for  all  such  shocks.  The  purpose  of  our  work  is,  then,  to  find 
the  relationship  between  the  probabilities  of  each  class  of  shock  and  the  reliability  of  the  whole 
network. 

In earlier work [2, 9], we noted that SENE paths form a simple series-parallel graph for TR
and a pair of intersecting binary trees for BR and KR. To incorporate dependent and multimode
failures, we again followed this concept, but also were required to include a careful and detailed
accounting of the shocks that may affect the SEs on the paths. Note that a single shock may affect
one, two, or three SEs on the relevant paths.

As an example, consider the computation of K-terminal reliability in an NxN SENE. We
outline it simply as follows. Let E be a Boolean variable such that E = 1 if and only if working
paths exist to all outputs in set K from input s. Then P(E) is the K-terminal reliability. Let $I_f$, $I_t$,
$I_x$, and $I_w$ denote the events that SE I is failed, stuck-at T, stuck-at X, and working, respectively.
Using the theorem of total probability and noting that $P(E \mid I_f) = 0$, we have

$$KR(K,N,s) = P(E \mid I_t)P(I_t) + P(E \mid I_x)P(I_x) + P(E \mid I_w)P(I_w). \qquad (1)$$

Equation (1) helps us obtain the K-terminal reliability by computing each term using the following
steps.

Step 1. Determine individual probabilities $P(I_t)$, $P(I_x)$, and $P(I_w)$.

Step 2. Compute conditional probability terms $P(E \mid I_t)$, $P(E \mid I_x)$, and $P(E \mid I_w)$.

Step 3. Obtain KR(K,N,s) using the results of Steps 1 and 2.

Steps  1  and  3  of  this  procedure  are  straightforward.  Step  2  requires  a  careful  case  analysis 
dependent  on  the  degraded  working  modes  of  the  SEs  and  on  the  composition  of  the  set  K.  For 
K-terminal  reliability,  we  describe  each  SE  on  a  path  from  a  specified  input  s  to  an  output  in  K  as 
marked. 
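
Step 3 of the procedure is a direct application of the theorem of total probability. The Python sketch below shows only that final combination; the mode labels, the function name, and all numerical values are hypothetical, and the conditional reliabilities of Step 2 are taken as given inputs rather than derived.

```python
def k_terminal_reliability(mode_prob, cond_rel):
    """Combine mode probabilities of an SE with conditional reliabilities.

    mode_prob: probability of each mode ("failed", "T", "X", "working").
    cond_rel:  P(E | mode) for each non-failed mode; a failed SE contributes 0.
    """
    if abs(sum(mode_prob.values()) - 1.0) > 1e-9:
        raise ValueError("mode probabilities must sum to 1")
    return sum(p * cond_rel.get(mode, 0.0) for mode, p in mode_prob.items())


# Hypothetical inputs, for illustration only.
MODE_PROB = {"failed": 0.10, "T": 0.05, "X": 0.05, "working": 0.80}
COND_REL = {"T": 0.5, "X": 0.0, "working": 0.9}

if __name__ == "__main__":
    print(k_terminal_reliability(MODE_PROB, COND_REL))
```

The case analysis in Step 2 is where the structure of the network and the composition of K enter; the combination itself costs constant time per mode.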

For an input s and a set K of outputs in an NxN SENE, where k = |K|, the equation below
computes the K-terminal reliability in O(k log N) time. Let $p_f$, $p_t$, $p_x$, $p_2$, $p_{3o}$, $p_{3i}$ denote the
probabilities of a Class 1 shock that causes failure of an SE, a Class 1 shock that causes an SE to
be stuck-at T, a Class 1 shock that causes an SE to be stuck-at X, a Class 2 shock, a Class 3out
shock, and a Class 3in shock. Let $p_w = 1-(p_t + p_x + p_f)$, $q_2 = 1-p_2$, $q_{3o} = 1-p_{3o}$, and $q_{3i} = 1-p_{3i}$.


KRiK,N,s)  =  [{p,  +  p.  - 2p„p^q;l)KR,{K,N,s)  +  p^q;lKR^iK,N,s)]qlqj,ql.

Recurrence expressions compute values for $KR_1(K,N,s)$ and $KR_2(K,N,s)$ according to the
different cases enumerated below. Computing the base case for N = 4 requires consideration of
several different cases. For the sake of brevity, we do not enumerate the results for the base case.

Case I: Only one child of considered SEs is marked.

1. If the T mode in these SEs allows working paths from input s to the k outputs, then
$KR_1(K,N,s)$ and $KR_2(K,N,s)$ can be computed by


A^.s)  =  (^.  + ,5  j 


KR2iK,N,s)  = 


(P/  +  +p^)(p^  +  py  +  a  ^p^-\)KRx 


2. If the X mode in these SEs allows working paths from input s to the k outputs, then
$KR_1(K,N,s)$ and $KR_2(K,N,s)$ can be computed by the equations above, exchanging $p_t$ and $p_x$.

Case II: Both children of considered SEs are marked. Let $K_l$ and $K_r$ be the subsets of K that lie in
the left and right subgraphs, respectively.

$KR_1(K,N,s)$ can be computed by

$KR_2(K,N,s)$ can be computed by

1. If input s can access the right (left) subgraph when the considered SEs are set in T (X) mode,
then

$KR_2(K,N,s)$ =


piqJ^KR2^Ki.j.!yK2[Kr-J.s 

*2p,p^qiiKRiU,.^.s]KR2Ur.J.s 


N 


N 


+^(pw{pf  +  a"‘Pw  - 


N 


Pl^toPtr 


2. If input s can access the left (right) subgraph when the considered SEs are set in T (X) mode,
then $KR_2$ can be computed by the equation above, exchanging $p_t$ and $p_x$.

In  summary,  we  used  the  shock  model  to  develop  efficient  algorithms  for  terminal, 
broadcast,  and  K-terminal  reliability  evaluation  of  an  SENE  with  both  dependent  and  multimode 
failures.  All  previous  work  in  this  area  assumed  that  failures  of  SEs  in  MINs  are  independent  and 
that  each  component  has  only  two  possible  modes.  In  the  real  world,  however,  these  assumptions 
are  not  often  true.  Our  algorithms  for  broadcast  and  K-terminal  reliability  evaluation  run  within  a 
constant  factor  of  time  of  the  algorithms  that  we  developed  [2,  9]  for  the  same  problems  under 
assumptions  of  independent  and  two-mode  failures.  Our  algorithm  for  terminal  reliability  runs  in 
O(log N) time as opposed to O(loglog N) time for the earlier TR algorithm that assumed
independent  and  two-mode  failures.  Though  we  developed  the  algorithms  for  the  SENE  MIN,  the 
principles  underlying  the  incorporation  of  dependent  failures  into  reliability  evaluation  algorithms 
readily  generalize  to  apply  to  reliability  evaluation  algorithms  for  any  other  regularly  structured 
MIN,  such  as  those  reported  in  the  literature  for  the  generalized  INDRA  network,  merged  delta 
network,  and  augmented  C  network.  Similarly,  though  we  developed  the  algorithms  for  an  SE 
model  that  includes  only  4  of  the  16  possible  modes  of  operation,  degraded  operation,  and  failure, 
the  underlying  principles  readily  generalize  to  incorporate  more  or  all  of  the  possible  modes  by 
simply including the appropriate cases. Finally, though we developed the algorithms for a
particular set of dependences between failures of SEs, one can use the shock model in the same
fashion to incorporate other dependences between failures.

2.3.  LINK FAILURES

Most  studies  carried  out  to  evaluate  the  reliability  measures  of  a  MIN  have  assumed  that 
only nodes are susceptible to failure or accounted for the failure of links by considering a link as
part  of  the  adjacent  switching  element.  Neither  approach  accurately  accounts  for  link  failures,  even 
though  the  failure  of  even  a  single  link  disconnects  several  input/output  paths,  leading  to  a  lack  of 
fault  tolerance  and  low  reliability.  We  have  designed  simple  and  efficient  algorithms  for  the  TR 
and  BR  evaluation  of  the  SENE  and  the  merged  delta  network  (MDN)  under  the  assumption  that 
both  nodes  and  links  can  fail  [10,  17].  We  assumed  two-mode  and  independent  failures.  This 
extension  to  consider  link  failures  is  in  line  with  our  other  work  on  incorporating  more  realistic 
assumptions  into  the  reliability  evaluation  of  the  regularly  structured  networks  in  MINs  and 
hypercubes. 

For the SENE, the TR and BR evaluation algorithms run in O(loglog N) and O(log N) time,
respectively. For the MDN, the algorithms each run in O(log N) time. These times are within a
constant factor of the times required for evaluation of the corresponding expressions that we
developed [2, 9] for the SENE and the ones reported for the MDN, though these earlier
expressions assumed that only SEs may fail and links are always working. Hence we can
conclude that incorporating both node and link failures into the reliability analysis for the SENE
and MDN adds no more than a constant factor to the computation time.

3.  Hypercubes 

3.1.  Results  -  An  Overview 

In  the  first  year  of  the  project,  we  explored  the  hypercube  reliability  problem  with 
deterministic  and  probabilistic  models.  A  deterministic  model  assumes  a  given  set  of  failures  in  a 
cube. Specifically, the problem was to determine the size and location of the maximal dimension
available (fault-free) subcube. We denote this as the reconfiguration problem. We developed a set
of algorithms, each based on a different representation of a hypercube, to solve the reconfiguration
problem.  Additionally,  we  extended  the  concepts  developed  in  the  reconfiguration  algorithms  to 
address  the  problem  of  dynamic  allocation  of  subcubes  of  a  hypercube  to  multiple  tasks. 

Probabilistic fault tolerance measures for hypercube multiprocessors are useful for packet-switching
applications because they verify the sturdiness of the topology and depict the probability
of  successful  flooding  (for  route  set  up  or  packet  transmission).  To  study  the  probabilistic  model, 
we  looked  into  terminal  and  network  reliability  evaluations  using  CAREL,  a  tool  used  to  compute 
general  network  survivability  measures.  (The  Year  1  report  in  Appendix  A  describes  CAREL.) 
Because  of  the  exponentially  large  number  of  paths  (spanning  trees)  involved  as  a  starting  step  for 
solving  the  terminal  (network)  reliability  problem,  even  the  efficient  general  algorithms  in  CAREL 
failed to generate results for hypercubes of dimension n > 4 in reasonable time [13].

In the second year of the project, we continued our efforts to discover efficient approaches
to  solve  reliability  problems  in  a  hypercube  architecture.  For  the  deterministic  model,  we  solved  the 
following  specific  problems: 

(a)  In  a  multiuser-multitasking  environment,  subcube  allocation  plays  an  important  role.  We 
obtained  an  efficient  distributed  algorithm  for  largest  operational  subcube  identification. 

(b)  The  K-Connected  Functionality  (KCF)  problem  applies  to  large  scale  degradable  hypercubes 
used  to  run  concurrent  algorithms  that  are  not  sensitive  to  changes  in  the  system  topology.  We 
established a bound on the number of faulty nodes that an n-dimensional hypercube, $C_n$, can
tolerate such that at least K processors remain connected, provided there are no $C_0$ or $C_1$
disconnections.

(c)  We  constructed  a  fault  tolerant  broadcasting  algorithm  useful  for  distributed  agreement  and 
clock  synchronization. 

Section  3.2  discusses  this  work  in  greater  detail. 

In addition to broadening our exploration towards deterministic models, we developed
methods in the probabilistic model for approximating network and terminal reliabilities using lower
and upper bounds, between which the exact measure is guaranteed to lie. Section 3.3 describes
this work.

3.2.  DETERMINISTIC  Model 

In  the  first  year  of  the  project,  we  obtained  a  centralized  algorithm  for  reconfiguration  of  a 
hypercube after faults [5]. In the second year of the project, we improved on this algorithm [1].

A  related  and  equally  important  issue  is  that  of  largest  operational  subcube  (LOS) 
identification.  This  year,  we  proposed  a  distributed  LOS  algorithm  [13].  A  distributed  algorithm 
allows  each  PE  v  to  run  a  program  to  obtain  a  subcube  with  maximum  size  k  that  contains  v.  Our 
method  uses  the  CMB  operator  of  CAREL  which  includes  the  multiple  variable  inversion  concept. 
The  computational  complexity  of  the  CMB  operator  is  data  dependent.  Hence,  we  consider  the 
best  and  worst  cases  of  the  LOS  approach.  We  proved  that  LOS  takes  Oim^)  steps  for  the  best 
case  while  O(m^)  for  the  worst  case,  where  m  <  n  denotes  the  number  of  faulty  nodes  in  a 
hypercube $C_n$. A heuristic, based on a modification of the LOS approach, however, generates
correct  results  in  time  0{rn^)  for  99%  of  the  problem  instances  tested.  In  case  the  number  of  non- 
available  nodes  (faulty  or  busy)  increases,  an  alternative  distributed  approach  processes  w  available 
nodes in O(wn) time to solve the LOS problem [13].

The KCF model uses the concept of forbidden faults and considers as forbidden those fault sets
that cause disconnection of a working 0-subcube, $C_0$. Researchers have shown that a $C_n$ can
tolerate up to 2n-3 faulty nodes and remain connected provided that the failures do not disconnect
any $C_0$ subcube. Our work further generalizes the KCF connectedness measure by extending the
forbidden set to include $C_1$ disconnections. We established that a $C_n$ can tolerate up to min{3n-5,
4n-9} faulty nodes and remain connected if disconnections of $C_1$ or its subsets do not occur. This
assumption is not impractical, as researchers have studied the probabilities of $C_i$ disconnection and
have shown that for $C_1$ it is very low.
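
For small cubes, the 2n-3 bound quoted above can be checked by exhaustive search. The Python sketch below is a brute-force illustration, not the proof technique of the cited work; the function names are hypothetical, and a $C_0$ disconnection is interpreted here as a healthy node whose n neighbors are all faulty.

```python
from itertools import combinations


def neighbors(node, n):
    return [node ^ (1 << d) for d in range(n)]


def is_connected(healthy, n):
    """Depth-first search restricted to the healthy nodes."""
    if not healthy:
        return True
    start = next(iter(healthy))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in neighbors(u, n):
            if v in healthy and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(healthy)


def isolates_a_node(faulty, n):
    """True if some healthy node has all n neighbors faulty."""
    return any(all(v in faulty for v in neighbors(u, n))
               for u in range(2 ** n) if u not in faulty)


def tolerates(n, max_faults):
    """Check every fault set of size <= max_faults that isolates no healthy node."""
    nodes = range(2 ** n)
    for size in range(max_faults + 1):
        for faulty in map(set, combinations(nodes, size)):
            if isolates_a_node(faulty, n):
                continue               # forbidden fault set, skip it
            if not is_connected(set(nodes) - faulty, n):
                return False
    return True


if __name__ == "__main__":
    for n in (3, 4):
        print(n, tolerates(n, 2 * n - 3))
```

The search is exponential in the number of nodes, so it serves only as a sanity check for n = 3 and n = 4; with one more fault (2n-2), a disconnecting set that isolates no node already exists for n = 3.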

We  adopted  a  hybrid  approach  to  fault-tolerant  broadcasting  that  uses  the  concepts  of 
redundant and non-redundant methods. It thus avoids faulty PEs in the communication paths to
improve the capability of an existing redundant type algorithm. Here, each PE sends the message
only to its healthy neighbors. Furthermore, the algorithm modifies the message reception
mechanism to recognize only the first arriving copy of the message and to ignore later redundant
copies [13].
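
The two rules described above (send only to healthy neighbors, accept only the first copy) can be sketched as a synchronous flooding simulation. The Python below is a simplified illustration under stated assumptions, not the algorithm of [13]: it assumes the source is healthy, that every PE knows which of its neighbors are faulty, and that one hop takes one time step. The function name is hypothetical.

```python
from collections import deque


def broadcast(n, source, faulty):
    """Flood a message through an n-cube; return {node: step of first arrival}."""
    received = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for d in range(n):
            v = u ^ (1 << d)
            if v in faulty:        # rule 1: send only to healthy neighbors
                continue
            if v in received:      # rule 2: later redundant copies are ignored
                continue
            received[v] = received[u] + 1
            queue.append(v)
    return received


if __name__ == "__main__":
    arrival = broadcast(3, 0, faulty={1, 2})
    print(sorted(arrival.items()))
```

In the example, node 3 has two faulty neighbors and is reached only through node 7, so its first copy arrives after four steps instead of two.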

3.3.  PROBABILISTIC  Model 

Our work on the probabilistic model concentrated mainly on generating lower bounds
for network and terminal reliabilities. A lower bound on reliability is quite appealing because it can
be obtained with substantially less computation than an exact value, and the system will be at least
as reliable as the bound. Towards obtaining bounds on network reliability (NR), we first solved Sperner
bounds and Kruskal-Katona bounds for hypercubes. These solutions, however, required
unreasonable computation time for cubes of dimension greater than 3. We next attempted exact
reliability evaluation using spanning trees and CAREL. An n-cube is a matroid and CAREL solves
reliability problems in time polynomial in the number of spanning trees, $ST(C_n)$. Unfortunately,
$ST(C_n)$ is exponential with respect to n. Thus, this technique is not advisable for large n. As an
alternative to these solution approaches, we derived a tighter lower bound on NR using structural
properties of the hypercube [4, 8, 13]. The NR bounding algorithm uses the structure of a $C_n$ to
recursively generate a lower bound from the knowledge of the lower bound for a $C_{n-1}$. One can
partition a $C_n$ into two $C_{n-1}$'s. We define an exterior link as a link between these two $C_{n-1}$'s. The
algorithm divides the problem into three mutually disjoint cases such that events in each case are
also mutually disjoint among themselves. Thus, the lower bound on NR is the sum of the lower
bounds on NR obtained from the following three cases.

Case 1. Both (n-1)-cubes and only one exterior link operate.

Case 2. All $2^{n-1}$ exterior links operate.

Case 3. For $2 \le i \le 2^{n-1}-1$, i exterior links and one (n-1)-cube operate.

Case 3 is further subdivided for $i = 2^{n-1}-1$ and $i = 2^{n-1}-2$ to help solve large problems
efficiently. Note that the contribution to NR from the Case 3 terms where $2 \le i \le 2^{n-1}-3$ is very small
and so is computed by one expression for all such terms.
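
To see why exact evaluation becomes impractical and bounds are attractive, the exact NR of a very small cube can be computed by enumerating every link state. The Python sketch below is a brute-force reference, not the bounding algorithm described above; it assumes perfectly reliable nodes and links that operate independently with a common probability p, and the function names are hypothetical.

```python
from itertools import product


def cube_links(n):
    """Every link of the n-cube once, as (u, v) with u < v."""
    return [(u, u ^ (1 << d)) for u in range(2 ** n) for d in range(n)
            if u < (u ^ (1 << d))]


def spans(n, up_links):
    """True if the operating links connect all 2**n nodes (union-find)."""
    parent = list(range(2 ** n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = 2 ** n
    for u, v in up_links:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components == 1


def network_reliability(n, p):
    """Exact all-terminal reliability by summing over all 2**(n*2**(n-1)) link states."""
    links = cube_links(n)
    total = 0.0
    for state in product((0, 1), repeat=len(links)):
        up = [link for link, s in zip(links, state) if s]
        if spans(n, up):
            total += p ** len(up) * (1 - p) ** (len(links) - len(up))
    return total


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, network_reliability(n, 0.9))
```

A 3-cube already has $2^{12}$ link states and a 4-cube has $2^{32}$, which is the growth that a recursive bound built from the two (n-1)-cubes and their exterior links avoids.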

The terminal reliability bounding algorithm for a source s and terminal t at distance n from
each other considers only shortest paths from s to t [4, 8, 13]. (If we consider only shortest paths,
then, without loss of generality, we may consider these as nodes 0 and $2^n-1$ in a $C_n$.) Of the n!
shortest (s,t) paths, n of these are node and link disjoint. We efficiently computed a lower bound on TR
from a specific set of $\alpha n$ paths, where $\alpha$ represents a multiplier. A routing algorithm determines
the parameter $\alpha$. We show that:

(a) $\alpha = \lceil (n/2)-1 \rceil$ if the following properties are satisfied.

Property N1: Paths $P_{i,j}$ and $P_{i,l}$ are node and link disjoint, for $j \ne l$.

Property N2: Paths $P_{i,j}$ and $P_{k,j}$ have (i+1) common nodes and links, for i < k.

Property N3: Paths $P_{i,j}$ and $P_{k,l}$ are node and link disjoint, for $j \ne l$.

(b) $\alpha = (n-2)$ if the following properties are satisfied.

Property L1: Paths $P_{i,j}$ and $P_{i,l}$ are link disjoint, for $j \ne l$.

Property L2: Paths $P_{i,j}$ and $P_{k,j}$ have (i+1) common links, for i < k.

Property L3: Paths $P_{i,j}$ and $P_{k,l}$ are link disjoint, for $j \ne l$.

In (a) and (b), let $1 \le i,k \le \alpha$ and $1 \le j,l \le n$. The algorithm exploits properties N1 through
N3 and L1 through L3 to generate TR bounds for node, link, and node and link failure cases. We
proved that our TR bound is tighter than previously established bounds based on disjoint shortest
paths. To improve the bounds, we have also constructed a 2-cube model for bounding TR of a
$C_n$. A combination of the Boolean and 2-cube approaches leads to a still tighter bound on TR.
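
The n node- and link-disjoint shortest paths between nodes 0 and $2^n-1$ can be generated by a standard rotation of the dimension order. The Python sketch below shows that textbook construction only; it is not claimed to be the routing algorithm that yields the $\alpha n$ paths used in the bound, and the function names are hypothetical.

```python
def rotation_paths(n):
    """n shortest paths from node 0 to node 2**n - 1.

    Path j corrects the address bits in the order j, j+1, ..., j+n-1 (mod n).
    """
    paths = []
    for j in range(n):
        node, path = 0, [0]
        for t in range(n):
            node ^= 1 << ((j + t) % n)
            path.append(node)
        paths.append(path)
    return paths


def internally_disjoint(paths):
    """True if no intermediate node appears on two different paths."""
    seen = set()
    for path in paths:
        inner = set(path[1:-1])
        if inner & seen:
            return False
        seen |= inner
    return True


if __name__ == "__main__":
    for path in rotation_paths(4):
        print([format(v, "04b") for v in path])
```

After t steps, path j has set a cyclic run of t consecutive bits starting at bit j, so two different paths never meet at an intermediate node; for n at least 2, paths that share only their endpoints are also link disjoint.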

Note  that  the  fault  model  above  allows  both  node  and  link  failures.  The  inclusion  of  link 
failures  implicitly  considers  multimode  failures  of  processors  as  follows.  A  processor  connects  to 
a  link  through  an  I/O  port  and  associated  control  circuitry.  For  example,  the  functional  block  of  a 
node  in  the  Intel  iPSC  architecture  comprises  64K  bytes  EPROM,  512K  bytes  DRAM,  an  Intel 
80286 processor, and an Intel Ethernet Controller 82586 for I/O. The failure of a port or its
controller is actually a partial failure of a processor, resulting in a degraded operational mode for
the processor. We model this multimode failure of a processor as a two-mode failure of a link.

4.  General  Results 

In  addition,  we  computed  the  node  and  link  failure  probabilities  in  a  typical  hypercube 
system  [13].  To  illustrate  this,  assume  an  Intel  iPSC  architecture.  Based  on  the  functional  block 
described above for its node, we utilized MIL-HDBK-217F to determine relevant parameters for
the chips and computed mean-time-to-failure rates. A link, in this case, includes an I/O unit at a
node,  physical  communication  media,  and  the  I/O  unit  of  the  adjacent  node.  We  have  also 
expanded  our  effort  towards  the  understanding  of  an  object-oriented  approach  to  evaluate  fault 
trees  with  dynamic  and  non-dynamic  gates.  HARP,  a  typical  fault  tree  solver,  uses  these  gate 
types  to  model  the  behavior  of  a  fault  tolerant  hypercube.  We  used  an  object-oriented  approach  to 
solve fault trees directly [12, 15]. This effort is different from the one given in HARP where an
indirect  approach  is  utilized  and  the  fault  tree  is  inherently  translated  into  a  Markov  model.  All 
these  efforts  are  helpful  in  understanding  different  aspects  of  the  hypercube  reliability  evaluation 
problem. 

5.  Future  Work 

Regarding  MINs,  the  most  significant  open  problem  related  to  our  work  is  to  obtain 
efficient  algorithms  for  evaluating  the  Network  Reliability  (NR)  of  MINs.  We  expect  that  efficient 
algorithms  exist  because  of  the  regular  structure  of  MINs  and  because  of  results  on  other  reliability 
problems. Following the principles we have established, an efficient algorithm for NR under
assumptions of independent failures and two-mode components should generalize readily
to an algorithm under assumptions of dependent failures and multimode components.

Regarding  hypercubes,  the  most  significant  open  problem  related  to  our  work  is  to  obtain 
reliability evaluation algorithms that incorporate dependence between failures. As for MINs, we
believe that the shock model can provide a basis for incorporating dependent failures into reliability
evaluation algorithms.

ABSTRACT

The hypercube architecture is a popular topology for many parallel processing
applications. For continued operation of the hypercube multiprocessors after the failure of one
or more i-subcubes and/or links, fault tolerance by reconfiguration is an important problem.
This paper considers the reconfiguration issue and presents an algebraic technique to analyze the
problem, extending the concepts in [9]. The technique uses algebraic operators to identify the
maximum dimensional fault-free subcube, and thus helps in achieving graceful degradation of
the system. We analyze the complexity of our algorithm and show that it is efficient as
compared to previous algorithms [1, 8, 14].

Keywords: hypercube reconfiguration, subcube recognition, fault free subcubes.

1.  Introduction 

The  suitability  of  a  multiprocessing  architecture  is  largely  affected  by  its  ability  to 
tolerate  faults.  After  identification  of  faulty  elements,  reconfiguration  of  the 
multiprocessor and the distributed algorithm running on the multiprocessor allows
graceful degradation. Reconfiguration ensures continued operation of a hypercube
multiprocessor after the failure of one or more subcubes (a subcube is a subgraph of a
hypercube that preserves the properties of the hypercube) and/or links. Algorithms exist
for diagnosing faulty processors and links in hypercubes [2, 3]. Fortunately, most
parallel algorithms can be formulated with the dimension n of the hypercube being a
parameter of the algorithm [1]. Hence, the reconfiguration problem in a hypercube
multiprocessor  reduces  to  identifying  the  maximum  dimensional  fault-free  subcube(s). 

Becker  and  Simon  [1]  provided  a  procedure  that  usually,  but  not  always,  finds  the 
maximum dimension d of a fault-free subcube. Özgüner and Aykanat [8] utilized the
principle of inclusion-exclusion in algorithms that always find d and also the number of
fault-free d-subcubes or the complete set of fault-free d-subcubes. Kim et al. [5] presented
a top-down processor allocation strategy that also applies to the reconfiguration problem.
Sridhar and Raghavendra [14] gave an algorithm for reconfiguration based on assigning
weights to nodes and identifying subcubes based on their maximum and minimum weight
nodes. Latifi [6] gave a distributed algorithm that follows a greedy heuristic for
identifying the largest fault-free subcube in a faulty hypercube. Most of these techniques
treat a link as a 1-subcube. When the end nodes of a link are not faulty, treating a link as
equivalent to a 1-subcube is erroneous.

This  paper  introduces  a  new  algebraic  technique  that  returns  a  list  of  fault-free 
subcubes for the purpose of reconfiguration after faults in a hypercube. Our technique has
the advantages of simplicity and improved time complexity over previous exact methods.
We present four operators, namely # (sharp), $ (dollar), D, and pi, to help describe our
method. The proposed technique is formulated to run on a single processor, which would
typically be the host or the resource manager in a commercial hypercube system.

The layout of the paper is as follows. Section 2 discusses the hypercube and its
properties and presents a fault model that allows subcube and link failures. The algebraic
operators are given in Section 3. Section 4 describes the algorithms and illustrates the
technique with examples. The complexity issues presented in Section 5 show that our
method is more efficient than previous approaches.

2.  Hypercube  Concepts  and  Fault  Models 

An n-dimensional hypercube is defined as Q_n = K_2 x Q_{n-1}, where K_2 is the complete
graph with two nodes, Q_0 is a trivial graph with one node, and x is the product operation
on two graphs [4]. Let Q_n be modeled as a graph G(V, E) with |V| = 2^n and |E| = n 2^{n-1}.
The graph G(V, E) is both node and link symmetric. Each node in G(V, E) represents a
processor and each edge represents a link between a pair of processors. Assign binary
numbers from 0 to (2^n - 1) to nodes such that addresses of any two adjacent nodes differ in
only one bit position. The reader is referred to [4, 11] for other interesting
properties of a hypercube graph.

Using an n-tuple, a processor in Q_n is denoted by h_{n-1} ... h_i ... h_0, where h_i ∈ {0, 1}.
Two adjacent nodes which differ in the ith bit are said to be in direction i (0 ≤ i ≤ n-1)
with respect to each other. A subcube in a hypercube Q_n is a subset of a hypercube that
preserves the properties of a hypercube. It is represented by an n-tuple {0,1,x}^n.
Coordinate values "0" and "1" can be referred to as fixed or bound coordinates and "x" as
free. An i-dimensional cube (or i-subcube, Q_i) in Q_n has (n-i) bound coordinates and i
free coordinates. Note that we will use the terms node and 0-subcube interchangeably
throughout the paper since they denote the same object. We describe a link with an
n-tuple {0,1,q}^n containing exactly one q. The position of the coordinate q in one of the n
coordinate positions indicates the adjacency direction for the end nodes. For example,
100q denotes the link with end nodes 1000 and 1001 in Q_4. Note that both x and q
cannot be present in the n-tuple representation of a Q_i. This notation differentiates a link
from a 1-subcube. We refer to the node, subcube, and link notation described above as
ternary vector (TV) notation.
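The TV notation above can be made concrete with a short Python sketch. The helper names (dimension, nodes_of, link_ends) are illustrative assumptions and do not come from the paper; strings are read left to right as coordinates n-1 down to 0.

```python
from itertools import product


def dimension(tv: str) -> int:
    """Number of free coordinates (x) of a subcube in TV notation."""
    return tv.count("x")


def nodes_of(tv: str) -> list[str]:
    """All node addresses covered by a subcube such as '1x0x'."""
    free = [i for i, ch in enumerate(tv) if ch == "x"]
    out = []
    for bits in product("01", repeat=len(free)):
        addr = list(tv)
        for pos, bit in zip(free, bits):
            addr[pos] = bit          # bind each free coordinate in turn
        out.append("".join(addr))
    return out


def link_ends(tv: str) -> tuple[str, str]:
    """End nodes of a link such as '100q' (exactly one q, no x)."""
    if tv.count("q") != 1 or "x" in tv:
        raise ValueError("a link has exactly one q and no x")
    return tv.replace("q", "0"), tv.replace("q", "1")
```

For instance, link_ends("100q") gives the two end nodes 1000 and 1001 named in the text, and a subcube with i free coordinates expands to 2^i nodes.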

When an i-subcube is faulty, we assume that all 2^i nodes forming the i-subcube
along with their interconnecting links are unavailable. Node failure is a special case of an
i-subcube fault, where i = 0. We assume that a node failure removes the node and all
incident links from the graph. A link failure has the effect of deleting the particular link
from G(V, E).

Note that a link and/or node may be faulty due to a hardware failure. When some
task is currently being executed on an i-subcube, the said i-subcube is temporarily
unavailable and may also be considered as faulty from the viewpoint of reconfiguring the
multiprocessor to run an additional task.


3.  Algebraic  Operators 

This section defines the algebraic operators #, $, D, and pi that we will use to find
maximum dimension non-faulty subcubes in the presence of subcube and link failures.
Each operator "o" works on pairs of n-tuples {0, 1, x}^n or {0, 1, q}^n. Each definition
begins with a table describing the "o" operation on each pair of corresponding elements
from the n-tuples, then uses this coordinatewise definition to define the "o" operation on a
pair of n-tuples. In what follows, c_k describes a working subcube, while f_s represents a
(failed) subcube or link. Algorithm 1 uses the # and $ operators to produce a set of
maximal size subcubes contained in c_k and disjoint from f_s. Algorithm 2 operates on
faults using the D operator, and uses the pi operator and equality checking to remove
redundant terms from the computation of the fault-free subcubes.

Definition 1: # operator. Let c_k = a_{n-1} ... a_i ... a_0 and f_s = b_{n-1} ... b_i ... b_0, where
a_i ∈ {0,1,x} and b_i ∈ {0,1,x}, where the fault type is a subcube failure. Table 1 defines
the coordinate # operation.

The following equation defines the # operation between c_k and f_s.

\[
c_k \,\#\, f_s =
\begin{cases}
c_k, & \text{if } a_i \,\#\, b_i = y \text{ for any } i, \\
\emptyset, & \text{if } a_i \,\#\, b_i = z \text{ for all } i, \\
\bigcup_{i \in P} a_{n-1} \ldots a_{i+1}\, \alpha_i\, a_{i-1} \ldots a_0, & \text{otherwise, where } P = \{\, i \mid a_i \,\#\, b_i = \alpha_i = 0 \text{ or } 1 \,\}.
\end{cases}
\tag{1}
\]

If C = {c_1, ..., c_r} is a set of n-tuples, then let C # f_s = ∪_{k=1}^{r} c_k # f_s. Miller [7]
described the sharp (#) operator and its properties. The # operator is quite general and a
modification to it finds use in PLA testing [10] and reliability computation of general
networks [13, 15]. Note that in the coordinate # operation, y denotes that the cubes are
disjoint and z indicates a possible overlap.
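A minimal Python sketch of the # operation may help fix ideas. Table 1 is not reproduced in this text, so the coordinate rule used here is an assumption reconstructed from the proof of Theorem 1 (unequal bound coordinates give y, a free coordinate against a bound one gives the complement bit, everything else gives z). The function names are illustrative.

```python
def coord_sharp(a: str, b: str) -> str:
    """Coordinate # : 'y' = disjoint, 'z' = possible overlap, else the bit to bind.

    Assumed table, reconstructed from the proof of Theorem 1.
    """
    if a in "01" and b in "01":
        return "z" if a == b else "y"
    if a == "x" and b in "01":
        return "1" if b == "0" else "0"   # complement of the fault's bit
    return "z"                            # b is 'x'


def sharp(c: str, f: str) -> set[str]:
    """Maximal subcubes of c that are disjoint from the faulty subcube f."""
    coords = [coord_sharp(a, b) for a, b in zip(c, f)]
    if "y" in coords:
        return {c}                        # c and f are already disjoint
    # One new cube per coordinate that returned a bit; empty if all are 'z'.
    return {c[:i] + r + c[i + 1:] for i, r in enumerate(coords) if r in "01"}
```

With a single faulty node 000 in Q_3, sharp("xxx", "000") returns the three 2-subcubes 1xx, x1x, and xx1.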

Definition 2: $ operator. Let c_k = a_{n-1} ... a_i ... a_0 and f_s = b_{n-1} ... b_i ... b_0, where
a_i ∈ {0,1,x} and b_i ∈ {0,1,q}, where the fault type is a link failure. Table 2 defines the
coordinate $ operator.

Define the $ operator between c_k and f_s as follows. Let c_k $ f_s = c_k, if a_i $ b_i = y
for any i; else, let c_k $ f_s = X ∪ Y ∪ Z, where for some j, a_j $ b_j = t and

\[
X =
\begin{cases}
\emptyset, & \text{if } a_j \,\$\, b_j = t \text{ for some } j \text{ and } a_i \,\$\, b_i = z \text{ for all } i \neq j, \\
\bigcup_{i \in P} a_{n-1} \ldots a_{i+1}\, \alpha_i\, a_{i-1} \ldots a_0, & \text{otherwise, where } P = \{\, i \mid a_i \,\$\, b_i = \alpha_i = 0 \text{ or } 1 \,\},
\end{cases}
\]
\[
Y = a_{n-1} \ldots a_{j+1}\, 0\, a_{j-1} \ldots a_0, \quad \text{and} \quad Z = a_{n-1} \ldots a_{j+1}\, 1\, a_{j-1} \ldots a_0. \tag{2}
\]

If C = {c_1, ..., c_r} is a set of cubes, then let C $ f_s = ∪_{k=1}^{r} c_k $ f_s. Note that in the
coordinate $ operation, y indicates that the cube and link are disjoint, z indicates a
possible overlap, and t in dimension j indicates that the faulty link is in dimension j and
the subcube contains links in dimension j.


Example 1. Consider a faulty link 00q (= f_1) in Q_3. For c_1 = xxx, we obtain 11t by
the coordinate $ operation. From Equation (2), the list of fault-free subcubes is c_1 $ f_1 =
{1xx, x1x, xx1, xx0}. The first two values are all maximal subcubes in c_1 disjoint from
00x, the 1-subcube containing link 00q; the next two values correspond to a split along
the direction of link 00q.
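Example 1 can be reproduced with a small Python sketch of the $ operation. Table 2 is not available in this text, so the coordinate rule is an assumption: a free coordinate against q yields t, a bound coordinate against q yields y (the cube then has no links in that direction), and the remaining entries follow the # rule. The names coord_dollar and dollar are illustrative.

```python
def coord_dollar(a: str, b: str) -> str:
    """Coordinate $ (assumed table): 'y', 'z', 't', or the bit to bind."""
    if b == "q":
        return "t" if a == "x" else "y"   # assumption for bound a against q
    if a in "01":
        return "z" if a == b else "y"
    return "1" if b == "0" else "0"       # a is 'x': complement of b


def dollar(c: str, f: str) -> set[str]:
    """Maximal subcubes of c that do not contain the faulty link f."""
    coords = [coord_dollar(a, b) for a, b in zip(c, f)]
    if "y" in coords:
        return {c}                        # the link does not lie inside c
    j = coords.index("t")                 # direction of the faulty link
    x_part = {c[:i] + r + c[i + 1:] for i, r in enumerate(coords) if r in "01"}
    y_part = c[:j] + "0" + c[j + 1:]      # split along direction j
    z_part = c[:j] + "1" + c[j + 1:]
    return x_part | {y_part, z_part}
```

Calling dollar("xxx", "00q") returns the four subcubes listed in Example 1.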

Definition 3. (a) Let f_s = b_{n-1} ... b_i ... b_0 be the TV description of a subcube.
Then D(f_s) = ∪_{i ∈ P} xx ... x α_i x ... xx, where α_i, the complement of b_i, is placed in
position i and P = {i | α_i = 0 or 1}.

(b) Let f_s = b_{n-1} ... b_i ... b_0 be the TV description of a link, where b_j = q. Then
D(f_s) = X ∪ Y ∪ Z, where X = ∪_{i ∈ P} xx ... x α_i x ... xx, where P = {i | i ≠ j}, Y = xx ...
x0x ... xx, and Z = xx ... x1x ... xx, where the 0 and 1 are in position j.

For example, D(101x) = {0xxx, x1xx, xx0x} and D(10q1) = {0xxx, x1xx, xx0x,
xx1x, xxx0}. Set D(f_s) contains all maximal subcubes disjoint from f_s.
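The two examples of the D operator can be checked with the following Python sketch. The function name d_operator is an illustrative assumption; the rule follows Definition 3 as read from its examples (complement each bound coordinate, and split on the q coordinate of a link).

```python
def d_operator(f: str) -> set[str]:
    """All maximal subcubes disjoint from fault f (subcube or link) in TV notation."""
    n = len(f)
    out = set()
    for i, b in enumerate(f):
        left, right = "x" * i, "x" * (n - i - 1)
        if b in "01":
            comp = "1" if b == "0" else "0"
            out.add(left + comp + right)      # bind position i to the complement
        elif b == "q":
            out.add(left + "0" + right)       # the two halves that cut the link
            out.add(left + "1" + right)
    return out
```

Free coordinates (x) of a faulty subcube contribute nothing, which is why D(101x) has three terms rather than four.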

Definition 4. pi operation. Let c_k and f_s denote subcubes, where c_k = a_{n-1} ... a_i ...
a_0 and f_s = b_{n-1} ... b_i ... b_0, where a_i ∈ {0,1,x} and b_i ∈ {0,1,x}. The coordinate pi
operation is defined in Table 3.

Define the pi operation between c_k and f_s as follows.

\[
c_k \;\mathrm{pi}\; f_s =
\begin{cases}
\emptyset, & \text{if } a_i \;\mathrm{pi}\; b_i = y \text{ for any } i, \\
d_{n-1} \ldots d_1 d_0, & \text{otherwise, where } d_i = a_i \;\mathrm{pi}\; b_i \text{ for } 0 \le i \le n-1.
\end{cases}
\]

Note that pi returns the subcube common to c_k and f_s. If c_k and f_s are disjoint, the
coordinate pi operation returns a y in the bit position for which c_k has a 1 and f_s has a 0
(or vice versa), and the pi operation returns the null set.

                  f_s
              0    1    x
         0    0    y    0
  c_k    1    y    1    1
         x    0    1    x

Table 3. Coordinate pi operation


Examples 2a, 2b, and 2c, given below, illustrate the pi operation and its use for
redundancy (duplication and absorption) checking. Assume the reference subcube is X.

Example (2a)             Example (2b)             Example (2c)

(X)      1 1 x x 0       (X)      1 1 x x 0       (X)      1 1 x x 0
(Y)      1 1 x x 0       (Z)      1 1 x 0 0       (W)      1 x x 1 0
pi: (A)  1 1 x x 0       pi: (B)  1 1 x 0 0       pi: (C)  1 1 x 1 0

Observe in Example 2a that cube A is equal to cube Y. Thus, cube Y is a subcube of
cube X and is redundant. Example 2b is similar. Example 2a is a case in which Y is
identical to X and Example 2b is a case in which Z is a proper subcube of X. Observe in
Example 2c that cube C is not equal to either cube W or cube X. Recall that pi produces
the subcube that is contained in both X and W. Thus, cube W is not a subcube of cube
X, and vice versa, so neither cube is redundant.
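Examples 2a to 2c can be replayed with a short Python sketch of the pi operation, read here as coordinatewise intersection. The names pi and is_subcube are illustrative, and returning None for the null set is a convention chosen for this sketch.

```python
def pi(c: str, f: str):
    """Subcube common to c and f in TV notation, or None when they are disjoint."""
    out = []
    for a, b in zip(c, f):
        if a == "x":
            out.append(b)        # free against anything keeps the other symbol
        elif b == "x" or a == b:
            out.append(a)
        else:
            return None          # coordinate result y: a 0 meets a 1
    return "".join(out)


def is_subcube(inner: str, outer: str) -> bool:
    """Redundancy check: inner is absorbed by outer when their pi equals inner."""
    return pi(inner, outer) == inner
```

In Example 2b, pi("11xx0", "11x00") equals the second operand, so that cube is absorbed; in Example 2c the result differs from both operands, so neither is redundant.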

4. Reconfiguration Algorithms

In this section, we present two reconfiguration algorithms based on the operators
discussed in Section 3. Algorithm 2 identifies and removes the redundant terms generated
in Algorithm 1. For both algorithms, C_{i+1} represents the set of fault-free subcubes after
the ith fault.

Algorithm 1. [Identification of fault-free subcubes]

Input: List of subcube and/or link faults f_1, f_2, ..., f_m.

C_1 = {c_1}, where c_1 = x x x ... x.        /* Q_n is initially fault-free. */

for i = 1 to m do
begin
    if f_i represents a subcube failure
    then C_{i+1} = C_i # f_i
    else C_{i+1} = C_i $ f_i                 /* f_i represents a link failure */
    if C_{i+1} = ∅, then return ∅
end

Return C_{m+1}.
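The loop of Algorithm 1 can be sketched in Python as follows. This is a compact illustration, not the authors' implementation: the helper split folds the # and $ operations into one function, and its treatment of a bound coordinate against q (the cube is returned unchanged) is an assumption, since Tables 1 and 2 are not reproduced in this text.

```python
def split(c: str, f: str) -> set[str]:
    """c # f for a subcube fault, c $ f for a link fault (f holds one 'q')."""
    bind, link_pos = [], None
    for i, (a, b) in enumerate(zip(c, f)):
        if b == "q":
            if a != "x":
                return {c}            # assumed: c has no link in this direction
            link_pos = i
        elif a == "x" and b in "01":
            bind.append((i, "1" if b == "0" else "0"))
        elif a in "01" and b in "01" and a != b:
            return {c}                # coordinate result y: disjoint
    out = {c[:i] + bit + c[i + 1:] for i, bit in bind}
    if link_pos is not None:          # split along the link's direction
        out |= {c[:link_pos] + bit + c[link_pos + 1:] for bit in "01"}
    return out


def algorithm1(n: int, faults: list[str]) -> set[str]:
    """Fault-free subcubes of Q_n after applying the faults one at a time."""
    cubes = {"x" * n}                 # Q_n is initially fault-free
    for f in faults:
        cubes = set().union(*(split(c, f) for c in cubes))
        if not cubes:
            return set()
    return cubes
```

For the link faults 0q and q1 in Q_2, this returns {1x, x0, 01, 11}; the term 11 lies inside 1x, which illustrates the redundant terms that Algorithm 1 may produce.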

Theorem 1: Given a list of faulty subcubes and links in an n-dimensional hypercube,
Algorithm 1 identifies all maximal dimension fault-free subcubes.

Proof: Let Σ be the fault-free portion of a faulty n-cube. Let C_m be the set of all
maximal fault-free subcubes in Σ. Now consider a faulty subcube or link f in Σ. Let Σ'
denote Σ with f removed. We establish that, given C_m, Algorithm 1 produces the set C_{m+1}
of all maximal fault-free subcubes in Σ'. This will prove that Algorithm 1 applied
to a succession of faults in an n-cube in turn will produce the set of all maximal fault-free
subcubes of the n-cube, since Algorithm 1 begins with a fault-free n-cube.

Consider a particular subcube c = a_{n-1} ... a_i ... a_0 of C_m, where a_i ∈ {0,1,x}. We
will show that given a new fault f, Algorithm 1 returns the set c' of maximal fault-free
subcubes in c, that is, the set of all maximal subcubes in c that do not contain f. We
describe the case in which f is a subcube; the case in which f is a link is handled
similarly.

Let f = b_{n-1} ... b_i ... b_0 be a faulty subcube, where b_i ∈ {0,1,x}. Algorithm 1
computes the set c' = c # f. Algorithm 1 computes the coordinate # operation between c
and f. If a_i # b_i = y for any i, then c contains a 1 (0) and f contains a 0 (1) in coordinate
position i, so they are disjoint and c' = {c}. If a_i # b_i = z for all i, then either a_i = b_i or
b_i = x for each i. Consequently, c is contained completely within f, so c' = ∅.

Otherwise, c and f overlap, but c is not contained within f. In these cases,
c' = ∪_{i ∈ P} a_{n-1} ... a_{i+1} α_i a_{i-1} ... a_0, where P = {i | a_i # b_i = α_i = 0 or 1}. For
fixed i, a_i # b_i = 0 (1) if a_i = x and b_i = 1 (0). The cube c_i' = a_{n-1} ... a_{i+1} α_i a_{i-1} ... a_0
= a_{n-1} ... a_{i+1} b̄_i a_{i-1} ... a_0, where b̄_i denotes the complement of b_i,
is clearly disjoint with f and contained in c. Cube c_i' is one dimension smaller than
c and so is maximal. All other subcubes of c either overlap with f or are contained within
an element of c'.

The argument above establishes that Algorithm 1 produces the set of all maximal
fault-free subcubes within each individual element of C_m. This set, over all elements of
C_m, forms C_{m+1}. It is clear that since Σ contains all maximal fault-free subcubes before
fault f, then no maximal fault-free subcubes in Σ' are excluded from C_{m+1}. Therefore,
Algorithm 1 produces all maximal fault-free subcubes within a faulty n-cube. ■

Algorithm 1 may produce redundant terms. In what follows, we describe a general
approach to produce only nonredundant terms using the D and pi operators. Shier and
Whited [12] confronted a similar problem while generating cutsets from pathsets by a
process called inversion. We introduce the following procedure that, given a set E = {E_1,
E_2, ..., E_p} of fault-free subcubes and the set e = {e_1, e_2, ..., e_r} of all maximal subcubes
disjoint from a new fault, produces the new set of fault-free subcubes and eliminates
redundant terms in the result.

Procedure Find_Subcubes(E, e)
begin
    RL1 = ∅; RL2 = ∅

    For each pair (E_j, e_i), where 1 ≤ j ≤ p and 1 ≤ i ≤ r, do the following
        Z = E_j pi e_i
        if Z = ∅, then go to the next (i, j) pair.
        if Z ≠ ∅ then do
        begin
            if Z = e_i, then Y_i = ∅
            if Z = E_j, then Y_j = ∅
            if Y_i = ∅ and Y_j = ∅
            then e = e - e_i; E = E - E_j
                 RL1 = RL1 ∪ Z, and go to the next (i, j) pair.
            else if Y_i (Y_j) = ∅ and Y_j (Y_i) ≠ ∅
            then e = e - e_i (E = E - E_j)
                 RL1 = RL1 ∪ Z, and go to the next (i, j) pair.
            else if Y_i ≠ ∅ and Y_j ≠ ∅
            then RL2 = RL2 ∪ Z
        end

    For each pair (α, β), such that α ∈ RL1, β ∈ RL2, do
    begin
        perform redundancy checking on (α, β); refer to Example 2
        if β is redundant, then RL2 = RL2 - β
    end

    For each pair (α, β), such that α, β ∈ RL2, do
    begin
        perform redundancy checking on (α, β); refer to Example 2
        if α is redundant, then RL2 = RL2 - α
        if β is redundant, then RL2 = RL2 - β
    end

    RL1 = RL1 ∪ RL2
    Return RL1
end

Lemma 1: Find_Subcubes(E, e) returns a set of terms RL1 that contains exactly the
nonredundant set of subcubes common to E and e.

Proof sketch: If cubes E_j and e_i are disjoint, then the coordinate pi operation between E_j
and e_i returns a y in at least one position, and Z = E_j pi e_i = ∅ and so makes no
contribution to the set of subcubes. Otherwise, the coordinate pi operation returns the
term Z that describes the cube common to E_j and e_i.

The algorithm next tests whether Z = e_i or Z = E_j. If Z = e_i, then this implies that
each x in e_i matches with an x in E_j, but E_j may have more x's than e_i. In this case, Y_i
= ∅ and e_i is a subcube of E_j, so e_i is removed. Similarly, if Y_j = ∅, then E_j is a subcube
of e_i and is removed. Similar reasoning holds for other cases. ■

Lemma 2: Let r_1 (r_2) be the size of RL1 (RL2). The redundancy checking in
Find_Subcubes requires up to r_1 r_2 + \binom{r_2}{2} checks and leaves only
nonredundant residual terms in RL2.

A straightforward approach to redundancy checking requires as many as \binom{r_1 + r_2}{2}
containment checks.

As an example, let links f_1 = 0q and f_2 = q1 be faulty in Q_2. Using TV notation,
C_2 = {1x, x1, x0} and D(f_2) = {1x, 0x, x0}. Applying Find_Subcubes(C_2, D(f_2)), we
get RL1 = {1x, x0}, and RL2 = {01, 00} before redundancy checking. The redundancy
checking of the procedure removes term '00' from RL2. Finally, we obtain three
nonredundant terms {1x, 01, x0}.
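The result of this example can be cross-checked with a simplified Python sketch. It does not follow the RL1/RL2 bookkeeping of Find_Subcubes; instead it takes the straightforward route of forming every pairwise pi and then discarding any term absorbed by another, which yields the same nonredundant set at a higher checking cost. Function names are illustrative.

```python
def pi(c: str, f: str):
    """Subcube common to c and f, or None when they are disjoint."""
    out = []
    for a, b in zip(c, f):
        if a == "x":
            out.append(b)
        elif b == "x" or a == b:
            out.append(a)
        else:
            return None
    return "".join(out)


def find_subcubes_simple(E: set[str], e: set[str]) -> set[str]:
    """Nonredundant subcubes common to E and e (brute-force absorption)."""
    common = set()
    for Ej in E:
        for ei in e:
            z = pi(Ej, ei)
            if z is not None:
                common.add(z)
    # Keep z only if no other term w contains it (pi(z, w) == z means z is inside w).
    return {z for z in common
            if not any(w != z and pi(z, w) == z for w in common)}
```

Applied to C_2 = {1x, x1, x0} and D(f_2) = {1x, 0x, x0}, the sketch returns {1x, 01, x0}, matching the three terms above.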

The procedure below reduces the amount of computational effort considerably over
Algorithm 1. Note, the order in which the E vector is processed may appreciably affect
the total amount of work done by the algorithm. In reliability literature [13],
preprocessing according to the increasing order of cardinality of terms is found to be
beneficial from the complexity viewpoint. In this case, a similar sorting will help find a
maximum size d-subcube and also contain the computational effort of the procedure.
Thus, we obtain the following algorithm.

Algorithm 2.

Input: List of subcube and/or link faults f_1, f_2, ..., f_m.

C_2 = D(f_1)
for i = 2 to m do
begin
    e = D(f_i)
    C_{i+1} = Find_Subcubes(C_i, e)
    C_{i+1} = Sort(C_{i+1})
end
Return C_{m+1}.
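A Python sketch of the Algorithm 2 driver is given below under stated assumptions: the redundancy removal is the brute-force absorption check rather than the RL1/RL2 procedure, and because Python sets are unordered the sort is applied once at the end (largest dimension first) instead of in every iteration. All function names are illustrative.

```python
def d_operator(f: str) -> set[str]:
    """All maximal subcubes disjoint from fault f (subcube or link)."""
    n, out = len(f), set()
    for i, b in enumerate(f):
        left, right = "x" * i, "x" * (n - i - 1)
        if b in "01":
            out.add(left + ("1" if b == "0" else "0") + right)
        elif b == "q":
            out.update({left + "0" + right, left + "1" + right})
    return out


def pi(c: str, f: str):
    """Subcube common to c and f, or None when they are disjoint."""
    out = []
    for a, b in zip(c, f):
        if a == "x":
            out.append(b)
        elif b == "x" or a == b:
            out.append(a)
        else:
            return None
    return "".join(out)


def find_subcubes_simple(E: set[str], e: set[str]) -> set[str]:
    """Pairwise pi followed by absorption of contained terms."""
    common = {z for z in (pi(Ej, ei) for Ej in E for ei in e) if z is not None}
    return {z for z in common
            if not any(w != z and pi(z, w) == z for w in common)}


def algorithm2(faults: list[str]) -> list[str]:
    """Nonredundant fault-free subcubes, largest dimension first."""
    cubes = d_operator(faults[0])
    for f in faults[1:]:
        cubes = find_subcubes_simple(cubes, d_operator(f))
    return sorted(cubes, key=lambda c: (-c.count("x"), c))
```

The first element of the returned list is a maximum dimension fault-free subcube, when any exists.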

Theorem 2: Given a list of faulty subcubes and links in an n-dimensional hypercube,
Algorithm 2 identifies all maximal dimension fault-free subcubes.

Proof: D(f_i) generates a covering for the minterms not contained in f_i. Thus C_2 = D(f_1)
has the effect of C_2 = xx...x # f_1 or C_2 = xx...x $ f_1, where f_1 is a subcube or link
failure, respectively. By Lemma 1, Find_Subcubes(C_i, D(f_i)) computes a set of terms
corresponding to the maximal subcubes contained in both C_i and D(f_i) with the redundant
terms removed, which again has the same effect as C_i # f_i or C_i $ f_i, where f_i is a
subcube or link failure, respectively. Thus, Algorithm 2 computes the same result as
Algorithm 1. By Theorem 1, the theorem is proved. ■

5.  Complexity  Analysis 

We now analyze the time complexity of the algorithms. For a given list of m faults,
Algorithm 1 computes C_{i+1} = C_i # f_i or C_i $ f_i, depending on whether f_i is a subcube or
link fault, on each of m iterations. Assume that computing c_k # f_i for one cube c_k and
c_k $ f_i for one cube c_k can be done in one time step for each resulting cube, so our object
is to bound the number of cubes in C_{i+1}. We consider subcube faults only, assuming that
each fault is a 0-subcube (that is, a node), as this will produce the worst case bounds.
Initially, C_1 = {x x x ... x}. By definition of the # operator, in the worst case, C_2 may
contain n cubes: x x ... x α_0, x x ... x α_1 x, ..., α_{n-1} x ... x, where α_j ∈ {0, 1}. In
the worst case, the set of cubes produced by c_k # f_i may contain at most as many cubes as
there are x's in c_k, and each of those cubes will contain one less x than c_k. Hence, C_{i+1}
may contain at most n(n-1) ... (n-i+1) = n!/(n-i)! cubes, where n is the dimension of
the hypercube Q_n under consideration. Actually, with n symbol positions and 3 possible
symbols, {0, 1, x}, there are at most 3^n possible cubes. Let v be the least value of i
such that n!/(n-i)! > 3^n. Note that n(n-1) ... (n-i+1) < n^i. So the time to compute m
iterations in the algorithm, for m < v, is

\[ \sum_{i=1}^{m} \frac{n!}{(n-i)!} = O(n^m). \]

And the time to compute m iterations in the algorithm, for m > v, is

\[ \sum_{i=1}^{v} \frac{n!}{(n-i)!} + \sum_{i=v+1}^{m} 3^n = O(n^v) + O((m-v)\,3^n) < O(m\,3^n). \]

In terms of N (= |V| = 2^n), the size of the hypercube, the time complexity for number of
faults m ≤ v is O(log^m N) = o(N), and the time complexity for m > v is O(m N^β), where
β = log_2 3. This improves on the time complexity O(n(N-m)^2) of Sridhar and
Raghavendra's algorithm [14].
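The quantities used in this analysis are easy to tabulate. The Python sketch below computes the worst-case count n!/(n-i)!, compares it with the cap 3^n, and finds the threshold v as defined in the text; the function names cube_bound and threshold_v are illustrative, and returning None when no such i exists is a convention of this sketch.

```python
from math import factorial


def cube_bound(n: int, i: int) -> int:
    """Worst-case number of cubes after i node faults: n(n-1)...(n-i+1)."""
    return factorial(n) // factorial(n - i)


def threshold_v(n: int):
    """Least i with n!/(n-i)! > 3**n, or None if the count never exceeds 3**n."""
    for i in range(1, n + 1):
        if cube_bound(n, i) > 3 ** n:
            return i
    return None
```

For n = 7 the counts run 7, 42, 210, 840, 2520 and first exceed 3^7 = 2187 at i = 5. For n = 6 the count never exceeds 3^6 = 729, since 6! = 720.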

If we wish instead to compute a list of only the fault-free subcubes of dimension at
least n-k, then we can obtain a better time complexity. If m ≤ k, then we again obtain a
time complexity of O(n^m). But if m > k, then we obtain the following time complexity.


\[ \sum_{i=1}^{k} \frac{n!}{(n-i)!} + \sum_{i=k+1}^{m} \frac{n!}{(n-k)!} \]

This time improves on Özgüner and Aykanat's algorithm [8] that requires


mk 


time to locate the available subcubes of dimension n-k or greater.


Algorithm 2 has a greater worst case time complexity, but a better expected time
complexity. As noted in the discussion of Algorithm 2, the pi operator and equality tests
are used to remove redundant terms from C_{i+1}. Thus, the number of terms in each C_i of
Algorithm 2 will be fewer than the number of terms in each C_i of Algorithm 1. In the
worst case, Algorithm 2 takes more time because of the time spent in redundancy
checking and sorting. In particular, the ith iteration takes O(n^{2i}) time as opposed to
O(n^i) time for the other algorithms, leading to an overall time complexity of O(n^{2m}) for
m ≤ v as opposed to O(n^m), and still O(m 3^n) for m > v. The detection and removal of
redundant terms should, however, allow improved time complexity as it will reduce the
size of each C_i.

Summary & Conclusions - The hypercube topology, also known
as the Boolean n-cube, has recently been used for multiprocessing
systems. Several authors have analyzed the performance of
hypercube-based systems. As the size and complexity of a system
increases, however, the reliability aspects become equally important
and should be included in the performance parameter study of the
system. This paper describes algorithms for computing lower bounds
on two reliability metrics, namely, terminal reliability (TR) and
network reliability (NR), measures often used for packet-switching
applications. The terminal (network) reliability is defined as the
probability that there exists a working path connecting two (all) nodes in
the stochastic graph model of the hypercube. Note, there are no
known polynomial time algorithms for exact computations of either
TR or NR for the hypercube; thus, lower bound computation is a
better approach. The paper presents polynomial time algorithms that
obtain improved lower bounds for both TR and NR than known
results. Some existing techniques for TR and NR evaluation are
discussed, and lower bound results are compared with previous bounds.
Illustrating examples are provided to describe the proposed
techniques. Our results show that for link reliability p = 0.93 or better,
which is the case for practical hypercube-based systems, both
reliability parameters are close to 1. These results further verify the
robustness of the hypercube architectures under link failures.

Keywords: Combinatorics, Hypercube, Lower Bounds, Network
Reliability, Terminal Reliability.


1. Introduction

The hypercube topology, also known as the Boolean n-cube or
binary n-cube [4], has recently been used for multicomputer systems.
Each of the 2^n nodes of an n-cube is a computer which is directly
connected to n neighboring nodes. References [4, 5] discuss
topological properties of a hypercube graph. Performance analysis of
hypercube-based systems has been addressed in [6, 7]. As the size
and complexity of a system increases, reliability aspects become
increasingly important parameters to be included in the performance
analysis of the system.

The reliability of hypercube-based multicomputer systems is
generally evaluated using the following models.

1) Terminal Reliability (TR) Model: The system works as long as a
specified input (node) is connected to a specified output (node).

2) Network Reliability (NR) Model [8]: The system works as long
as all nodes in the system are connected.

3) Task-based Reliability Model (IJI]: The system works as long
as some minimum number of connected nodes are available on
the system for task execution.

4) Functional Subcube Model [3]: The system works as long as
some functional minimum degree subcube exists.

This work is supported in part by the Air Force Office of Scientific Research
under grant AFOSR-91-0025.


The models may be evaluated by assuming that only the nodes
can fail while the links are reliable, or only the links can fail while
the nodes are perfect, or both nodes and links can fail. In addition,
each model assumes that node / link failures are statistically
independent. A conventional reliability modeling approach usually considers
a stochastic graph model with failing links. In hypercube structures,
this approach is warranted to verify the sturdiness of the network
model of the topology. Moreover, in a hypercube, the network
reliability is obvious with perfect links and failing nodes. The NR with
node failures is just the probability that all nodes are operational,
which is the product of the node reliabilities.

In general networks, the problem of computing any of the first
three measures exactly is NP-complete [9] and so will require an
unreasonable amount of computation time. It is often possible,
however, to much more efficiently obtain upper and lower bounds on a
reliability measure. The lower bound is of greater interest as the
system will be at least this reliable.

In this paper, we investigate the TR and NR models, and
consider the link failures case. We propose new techniques to improve
lower bounds for both the TR and NR. These techniques produce
better results and are computable in time polynomial in the order of
the dimension of the hypercube.

The layout of the paper is as follows. Section 2 presents the
background material, which includes notations, assumptions, and
discussion of earlier work done on the topic. Section 3 proposes an
algorithm to obtain a tighter lower bound on TR, while Section 4
develops an algorithm for an improved bound on NR. In Section 5,
we present the bounds computed for TR and NR using the new
techniques, and compare the results with those obtained by previous
methods.


2.  Background 

2.1  Notations  and  Assumptions 

Notations 

C_n                   n-cube or hypercube
[symbols illegible]   number of nodes (links) in C_n: 2^n (n 2^{n-1})
p (q)                 link reliability (unreliability); p + q = 1
ST(C_n)               number of minimal spanning trees in C_n
[symbol illegible]    a function representing (1 - p^i)
TR(C_n, p)            terminal reliability of C_n with link reliability p
NR(C_n, p)            network reliability of C_n with link reliability p
G_i                   ith group of paths
h_{i,j}               jth path of group G_i
[symbol illegible]    reliability contribution of h_{i,j}
[symbol illegible]    reliability contribution of G_i

Assumptions

a) A link is bidirectional.

b) Links are statistically identical and have the same probability of failure.

c) Link failures are statistically independent, and nodes are perfect.

2.2 Previous Work

2.2.1 Terminal Reliability

Let (s, t) be a pair of source and terminal nodes in an n-cube.
We will compute a lower bound on TR using only shortest length
paths. Without loss of generality, consider the addresses for s and t
as 0 and 2^n - 1, respectively. These nodes are at diameter distance,
and there are n! (s, t) minpaths in C_n [4]. The terminal reliability (TR)
of a cube is the probability that there is a working path from s to
t despite link failures. This probability can be computed by first
generating the n! source-terminal paths for the C_n, and then using
a sum of disjoint products (SDP) technique [9] to transform it into
an analogous reliability expression for the TR value. The numbers of
cuts and paths for the cube grow exponentially with the number of
nodes in C_n; hence, it is not efficient to compute TR for large C_n
using the above method. Alternatively, computing a lower bound on
TR offers the possibility of obtaining an important insight into the
value of TR at a much better computational cost.
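The count of n! minpaths can be seen directly: a shortest path from node 0 to node 2^n - 1 sets each of the n address bits exactly once, and every order of the bits gives a different path. The Python sketch below enumerates them; the function name minpaths is illustrative.

```python
from itertools import permutations


def minpaths(n: int):
    """Yield each shortest path from node 0 to node 2**n - 1 as a list of addresses."""
    for order in permutations(range(n)):
        node, path = 0, [0]
        for bit in order:
            node |= 1 << bit          # traverse the link in direction `bit`
            path.append(node)
        yield path
```

Each yielded path has n links, and the generator produces n! paths, which is why exhaustive path-based evaluation becomes impractical as n grows.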

Of the (s, t) minpaths, n of these are disjoint. A lower
bound on TR is computed by considering only these n disjoint paths
and using a method used to obtain the reliability of a series-parallel
system [9]. Let TR_1(C_n, p) be the lower bound on TR for a C_n with
link reliability p by considering only its n disjoint paths. Then,

\[ TR_1(C_n, p) = 1 - (1 - p^n)^n. \tag{1} \]

The model is straightforward, but rapidly deteriorates as n increases.
In Section 3, we provide a tighter bound on terminal reliability for an
n-cube.
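The bound of Equation (1) treats the n disjoint minpaths as n parallel branches of n series links each. The Python sketch below evaluates it and also constructs one possible set of n disjoint paths by cyclically rotating the order in which bits are set. That rotation construction is an assumption made for illustration, since the text does not say which n paths are used.

```python
def tr1(n: int, p: float) -> float:
    """Lower bound on TR from n disjoint paths of n links each, per Equation (1)."""
    return 1.0 - (1.0 - p ** n) ** n


def disjoint_paths(n: int) -> list[list[int]]:
    """n shortest paths from 0 to 2**n - 1 that share no link (rotation construction)."""
    paths = []
    for start in range(n):
        node, path = 0, [0]
        for k in range(n):
            node |= 1 << ((start + k) % n)   # set bits start, start+1, ... cyclically
            path.append(node)
        paths.append(path)
    return paths
```

For example, tr1(2, 0.9) evaluates to 1 - (1 - 0.81)^2 = 0.9639.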

2.2.2 Network Reliability

The network reliability (NR) or all-terminal reliability of a C_n is
the probability that every node in the n-cube can communicate with
every other node despite link failures. One may compute the exact
value for NR of a C_n by first generating its minimal spanning trees,
then in turn using these to obtain NR. An n-cube is a matroid [9], so
one is able to compute NR from the set of minimal spanning
trees, which is of size ST(C_n), in time polynomial in ST(C_n).
Unfortunately, ST(C_n) is exponential with respect to n. Thus, this
technique is not advisable for large C_n. Furthermore, there is no
known polynomial time algorithm for the NR problem in C_n.
Obtaining a lower bound on NR is, obviously, a more practical approach.

'atg et al. [I'l gave a lower bound on NR by considering the

number of spanning trees in C_n. Then, they weighted each spanning tree by considering its 2^n - 1 links as operational and the rest of the link set as failed. Their bound is acceptable only for very small p. The reliability polynomial concept [9] may also be used for NR; as noted in [8], however, the resultant lower bound on NR is not tight. Furthermore, the method based on this concept is polynomial only in terms of the number of links, and so is exponential in the dimension of the n-cube. Bulka and Dugan [8] obtained a lower bound on the NR of an n-cube from a lower bound on the reliability of an (n-1)-cube, which in turn is obtained from one on an (n-2)-cube, and so on, until the cube is small enough (base cube) that its reliability can be evaluated exactly. Note, Bulka and Dugan used a C_2 as the base, for which the exact reliability is given as:

NR(C_2, p) = 4p^3(1 - p) + p^4.
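The base-cube formula can be cross-checked by brute force, since C_2 has only 2^4 link states. The sketch below is illustrative only; the node numbering and the helper names are assumptions, not part of the source.

```python
from itertools import product

# C_2 is a 4-cycle on nodes 0..3 (binary 00, 01, 11, 10); one tuple per link.
LINKS = [(0, 1), (1, 3), (3, 2), (2, 0)]


def connected(up_links):
    """True if all 4 nodes are in one component using only the given links."""
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for a, b in up_links:
            for x, y in ((a, b), (b, a)):
                if x == u and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return len(seen) == 4


def nr_c2_bruteforce(p):
    """Sum the probability of every link state that keeps C_2 connected."""
    total = 0.0
    for state in product((0, 1), repeat=len(LINKS)):
        if connected([l for l, s in zip(LINKS, state) if s]):
            k = sum(state)
            total += p ** k * (1 - p) ** (len(LINKS) - k)
    return total


def nr_c2_formula(p):
    """Closed form quoted for the base cube: 4 p^3 (1 - p) + p^4."""
    return 4 * p ** 3 * (1 - p) + p ** 4
```

Both functions give 0.9477 for p = 0.9: the cycle stays connected when all four links work or when exactly one fails.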

Bulka and Dugan's algorithm [8] computes the lower bound on NR of a C_n in three steps. In the following, all 2^{n-1} links connecting the two (n-1)-cubes are termed exterior links.

Step 1. Both (n-1)-cubes are operating and i exterior links operate, 1 ≤ i ≤ 2^{n-1} - 2.

T1 = NR(C_{n-1}, p)^2 \sum_{i=1}^{2^{n-1}-2} \binom{2^{n-1}}{i} p^i (1 - p)^{2^{n-1}-i}


Step 2. At least one (n-1)-cube is operating, and one exterior link is failed.


Step 3. All 2^{n-1} exterior links operate.

r  3  •  A'^  icr. .  p  ^*zpp )  p  *  ■ ' 

The lower bound on network reliability of an n-cube is obtained as

NR(C_n, p) = T1 + T2 + T3.    (2)


Computing T1 in Step 1 takes time exponential in the dimension of the cube, and thus is useful only for dimensions n ≤ 9. For large n, reference [8] views an n-cube as a 1-cube with two (2^{n-1})-supernodes connected by one (2^{n-1})-link. Reference [8] next generalizes the approach by considering the n-cube as a 2-cube whose four nodes are each (n-2)-cubes, or a 3-cube whose eight nodes are each (n-3)-cubes, and so on. In general, an n-cube can be viewed as an (n-k)-cube whose 2^{n-k} nodes are each k-cubes which are connected by (2^k)-links. Therefore, if the original link reliability is p, then the new link reliability is 1 - (1 - p)^{2^k}, which is the probability that at least one of the 2^k links is operating. Using this approach, Bulka and Dugan obtained another lower bound on network reliability of an n-cube as

NR(C_n, p) = NR(C_k, p)^{2^{n-k}} · NR(C_{n-k}, 1 - (1 - p)^{2^k}).    (3)

Note, for large k and p (k ≥ 2 and p ≥ 0.9), the second term of the right hand side in Equation (3) is approximately equal to 1. Thus, a simplified version of (3) is given as:

NR(C_n, p) ≈ NR(C_k, p)^{2^{n-k}}.    (4)


Comparing Equation (3) or (4) with Equation (2), it is clear that Equation (2) provides a tighter bound.

In what follows, we suggest an improvement on Equation (2).


First, using the simple combinatorial identity

\sum_{i=0}^{m} \binom{m}{i} p^i (1 - p)^{m-i} = 1,    with m = 2^{n-1},

we express T1 as:

T1 = NR(C_{n-1}, p)^2 (1 - (1 - p)^m - m p^{m-1} (1 - p) - p^m).    (5)

Note, Equation (5) is polynomial in the dimension of the cube, and hence, we need not use an inferior lower bound obtained by Equation (3) or (4). Second, to improve the reliability contribution of the events in Step 3, we alternatively consider the following situations:

Step 3a. At least one (n-1)-cube operates.

T3a = (NR(C_{n-1}, p)^2 + 2 NR(C_{n-1}, p)(1 - NR(C_{n-1}, p))) p^{2^{n-1}}.


Step 3b. Both (n-1)-cubes fail, but if we contract each pair of congruent nodes, reducing the n-cube into an (n-1)-cube, the combined links form a connected (n-1)-cube. The probability is given as

r3b  -  (2  (1  -NR(C.-,.p))^  ■  Nir{C..,.p)Jp'‘-. 


However, even with the suggested improvement, the resultant lower bound is still not tight for p < 0.9 and n > 10. In Section 4 we present a better lower bound on network reliability which gives a significant improvement over the results based on the discussed method.
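The step from the exponential-size sum in Step 1 to a closed form rests only on the binomial identity, so it can be checked numerically. The sketch below assumes that the sum in Step 1 runs over i = 1, ..., m - 2 operating exterior links with m = 2^{n-1}, as the step description states; the function names are ours.

```python
from math import comb


def t1_sum_factor(m, p):
    """Direct sum over i = 1 .. m-2 operating exterior links."""
    q = 1.0 - p
    return sum(comb(m, i) * p ** i * q ** (m - i) for i in range(1, m - 1))


def t1_closed_factor(m, p):
    """Same quantity via the identity: remove the i = 0, m-1 and m terms from 1."""
    q = 1.0 - p
    return 1.0 - q ** m - m * p ** (m - 1) * q - p ** m
```

The closed form needs only a handful of powers, whereas the direct sum has m - 2 = 2^{n-1} - 2 terms.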

3. Terminal Reliability - An Improved Bound

To improve the lower bound TR_1(C_n, p), we consider n(n-2) minpaths, selected and ordered using a routing method described below, and then use a Boolean technique to compute the reliability value. Note, the selected n(n-2) minpaths include the n disjoint paths used for generating TR_1(C_n, p), so our lower bound is tighter. Furthermore, the proposed method uses the minpaths to analytically calculate the reliability without enumerating them. Let TR_2(C_n, p) be the terminal reliability obtained from these n(n-2) paths. Consider the minpaths to be in (n-2) groups, each of which consists of n paths. For i = 1, 2, ..., n-2, and j = 1, 2, ..., n, let P_{i,j} represent the jth path in a group of paths G_i. In what follows we present the routing algorithm that enumerates the n(n-2) minpaths in C_n.

3.1 Routing Algorithm

Any minpath from source node 0 to terminal node 2^n - 1 traverses n links in n different dimensions. Number the dimensions by 0 to n-1. Let d_0 d_1 ... d_{n-1}, where {d_0, d_1, ..., d_{n-1}} = {0, 1, ..., n-1}, denote the path from node 0 to node 2^n - 1 obtained by first traversing a link in dimension d_0, then traversing a link in dimension d_1, ..., then traversing a link in dimension d_{n-1}.

Algorithm

1) P_{1,1} = 0 1 2 3 ... (n-1)

   for j = 2 to n do begin

      Path P_{1,j} is obtained by traversing dimensions in the order
      given by a left rotate by one of P_{1,j-1}.

   end;

2) for i = 2 to n-2 do begin

      for j = 1 to n do begin

         P_{i,j} has the same first traversed dimension as P_{1,j}.

         The last i-1 traversed dimensions of P_{i,j} are the same as
         those in P_{i-1,j}. The remaining n-i dimensions (that is, the
         2nd to (n-i+1)th dimensions traversed) are given by a left
         rotate by one of the dimensions in the same positions in
         P_{i-1,j}.

      end;

   end;
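The two steps above can be sketched in Python as follows. This is a minimal illustration under our own conventions: paths are lists of dimension indices, groups and paths are 0-indexed in the returned list, and the name `routing_paths` is hypothetical.

```python
def routing_paths(n):
    """Return groups[i][j], the dimension order of path P_{i+1,j+1} in an n-cube.

    Group 1 holds the n left rotations of 0 1 ... n-1.  Each later group keeps
    the first dimension and the last i-1 dimensions of the path above it and
    left-rotates the middle block by one position.
    """
    first = [[(j + d) % n for d in range(n)] for j in range(n)]
    groups = [first]
    for i in range(2, n - 1):                 # groups 2 .. n-2
        current = []
        for path in groups[-1]:
            head = path[:1]                   # first traversed dimension
            middle = path[1:n - i + 1]        # 2nd .. (n-i+1)th dimensions
            tail = path[n - i + 1:]           # last i-1 dimensions
            current.append(head + middle[1:] + middle[:1] + tail)
        groups.append(current)
    return groups
```

For n = 6 this yields four groups of six paths; for instance `routing_paths(6)[1][0]` is `[0, 2, 3, 4, 1, 5]`, the path P_{2,1}.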


3.2 Boolean Techniques Concept

Boolean techniques for reliability evaluation start with a sum of products expression for pathsets and convert it into an equivalent sum of disjoint products (SDP) expression [11]. In the SDP form, an UP or logical success (DOWN or failure) state of a link i is replaced by link reliability p (unreliability q), and the Boolean sum (product) by the arithmetic sum (product). In other words, the SDP expression is interpreted directly as an equivalent probability expression of terminal reliability. If P_i represents a path identifier (an UP state of a link in a path P_i has 1 in P_i, while a don't care is represented by 0), the sum of products expression P is given by:

P = \bigcup_{i=1}^{n} P_i    (6)

where n denotes the number of minpaths between the (s, t) node pair in G(V, E). Equation (6) is modified either canonically or conservatively to generate the equivalent SDP expression, P(disjoint). The conservative modification is usually preferred, since it is more efficient compared with canonical modification, where 2^l events are required to determine P(disjoint) (l is the number of links in the network). A simple way to generate mutually disjoint events in Equation (6) is as follows:


The algorithm generates n(n-2) paths with the following properties.

Property 1. Paths P_{i,j} and P_{i,k} are link disjoint, for j ≠ k.

Property 2. Paths P_{i,j} and P_{k,j} have (i+1) common links, for i < k.

Property 3. Paths P_{i,j} and P_{k,l} are link disjoint, for j ≠ l.

Observe that G_1 comprises the n disjoint paths considered for TR_1(C_n, p). The TR_2(C_n, p) is computed analytically by using the concept of Boolean techniques [9,11].

Example 1. To illustrate Properties 1 through 3, consider a C_6 with source and terminal nodes as s = 000000 and t = 111111, respectively. Using the above algorithm, the various 6(6-2) minpaths grouped and ordered are given as follows, where the path is listed to the left of the arrow and the nodes visited (excluding the source and terminal nodes) are listed to the right of the arrow.

Group G_1:

P_{1,1}: 0 1 2 3 4 5 -> 000001-000011-000111-001111-011111
P_{1,2}: 1 2 3 4 5 0 -> 000010-000110-001110-011110-111110
P_{1,3}: 2 3 4 5 0 1 -> 000100-001100-011100-111100-111101
P_{1,4}: 3 4 5 0 1 2 -> 001000-011000-111000-111001-111011
P_{1,5}: 4 5 0 1 2 3 -> 010000-110000-110001-110011-110111
P_{1,6}: 5 0 1 2 3 4 -> 100000-100001-100011-100111-101111

Group G_2:

P_{2,1}: 0 2 3 4 1 5 -> 000001-000101-001101-011101-011111
P_{2,2}: 1 3 4 5 2 0 -> 000010-001010-011010-111010-111110
P_{2,3}: 2 4 5 0 3 1 -> 000100-010100-110100-110101-111101
P_{2,4}: 3 5 0 1 4 2 -> 001000-101000-101001-101011-111011
P_{2,5}: 4 0 1 2 5 3 -> 010000-010001-010011-010111-110111
P_{2,6}: 5 1 2 3 0 4 -> 100000-100010-100110-101110-101111

Group G_3:

P_{3,1}: 0 3 4 2 1 5 -> 000001-001001-011001-011101-011111
P_{3,2}: 1 4 5 3 2 0 -> 000010-010010-110010-111010-111110
P_{3,3}: 2 5 0 4 3 1 -> 000100-100100-100101-110101-111101
P_{3,4}: 3 0 1 5 4 2 -> 001000-001001-001011-101011-111011
P_{3,5}: 4 1 2 0 5 3 -> 010000-010010-010110-010111-110111
P_{3,6}: 5 2 3 1 0 4 -> 100000-100100-101100-101110-101111

Group G_4:

P_{4,1}: 0 4 3 2 1 5 -> 000001-010001-011001-011101-011111
P_{4,2}: 1 5 4 3 2 0 -> 000010-100010-110010-111010-111110
P_{4,3}: 2 0 5 4 3 1 -> 000100-000101-100101-110101-111101
P_{4,4}: 3 1 0 5 4 2 -> 001000-001010-001011-101011-111011
P_{4,5}: 4 2 1 0 5 3 -> 010000-010100-010110-010111-110111
P_{4,6}: 5 3 2 1 0 4 -> 100000-101000-101100-101110-101111
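The listing in Example 1 and Properties 1 through 3 can be checked mechanically for a given n. The sketch below regenerates the paths, converts each to its set of links, and tests the three properties; it assumes that dimension d corresponds to bit d of the node address (least significant bit is dimension 0), and all function names are ours.

```python
def routing_paths(n):
    """Compact form of the routing algorithm: groups[i][j] is P_{i+1,j+1}."""
    groups = [[[(j + d) % n for d in range(n)] for j in range(n)]]
    for i in range(2, n - 1):
        groups.append([p[:1] + p[2:n - i + 1] + p[1:2] + p[n - i + 1:]
                       for p in groups[-1]])
    return groups


def links_of(path):
    """Set of links (from_node, to_node) used by a path starting at node 0."""
    node, links = 0, set()
    for d in path:
        nxt = node | (1 << d)
        links.add((node, nxt))
        node = nxt
    return links


def intermediate_nodes(path, n):
    """Nodes visited, excluding source and terminal, as n-bit strings."""
    node, out = 0, []
    for d in path[:-1]:
        node |= 1 << d
        out.append(format(node, f"0{n}b"))
    return "-".join(out)


def check_properties(n):
    """True if Properties 1, 2 and 3 hold for every pair of generated paths."""
    g = [[links_of(p) for p in grp] for grp in routing_paths(n)]
    for i in range(len(g)):
        for k in range(i, len(g)):
            for j in range(n):
                for l in range(n):
                    common = len(g[i][j] & g[k][l])
                    if j != l and common != 0:                 # Properties 1, 3
                        return False
                    if j == l and i < k and common != i + 2:   # Property 2
                        return False
    return True
```

With 0-based group index i, Property 2 reads "i + 2 common links", which is the (i+1) of the text for 1-based i. `check_properties(6)` returns True for the 24 paths of Example 1.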


where \bar{P}_j denotes a DOWN event of path P_j. The probability of UP (operational) for an ith term \bar{P}_1 \bar{P}_2 ··· \bar{P}_{i-1} P_i can be evaluated using conditional probability and standard Boolean operations as:

Pr(\bar{P}_1 \bar{P}_2 ··· \bar{P}_{i-1} P_i) = Pr(P_i) Pr(\bar{P}_1 \bar{P}_2 ··· \bar{P}_{i-1} | P_i) = Pr(P_i) Pr(\bigcap_{j=1}^{i-1} E_j).

Here, E_j represents a conditional cube [11] and defines conditions for a path identifier P_j DOWN given P_i UP (operational). For the equality to hold good, the E_j's must have non-redundant and mutually disjoint terms. The probability of the first event Pr(P_i) can be determined in a straightforward manner since failures are assumed to be statistically independent. The various terms within the E_j's will, in general, not be disjoint [9,11]. This necessitates making the E_j's mutually disjoint before we generate the equivalent probability expression.
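For two paths the conditional step is easy to see: given that P_2 is UP, P_1 is DOWN exactly when at least one of its links outside P_2 fails. The following sketch compares that SDP form with plain inclusion-exclusion. It is a toy illustration, not the algorithm of [11]; paths are modelled as Python sets of link labels and the function names are ours.

```python
from itertools import combinations


def union_prob_incl_excl(paths, p):
    """Pr(at least one path is UP) by inclusion-exclusion over link sets."""
    total = 0.0
    for r in range(1, len(paths) + 1):
        for subset in combinations(paths, r):
            links = set().union(*subset)          # links needed by all of them
            total += (-1) ** (r + 1) * p ** len(links)
    return total


def union_prob_sdp_two(path1, path2, p):
    """SDP form P2 + (not P1 given P2 UP) rearranged as P1 + (not P1) P2."""
    private = set(path1) - set(path2)             # links of P1 not shared with P2
    return p ** len(path1) + p ** len(path2) * (1.0 - p ** len(private))
```

For example, with P_1 = {a, b, c}, P_2 = {a, d} and p = 0.9, both functions return 0.8829.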

3.3 Improving Lower Bound on Terminal Reliability

(a) Boolean Approach

Let P_{i,j} be the path identifier for path P_{i,j}. The sum of products expression P for the n(n-2) minpaths is given as:

P = \bigcup_{i,j} P_{i,j},    1 ≤ i ≤ n-2, 1 ≤ j ≤ n.    (8)

Using Equation (7), the SDP form for (8) is obtained as:

P(disjoint) = P_{1,1} + \bar{P}_{1,1} P_{1,2} + ··· + \bar{P}_{1,1} ··· \bar{P}_{i,j-1} P_{i,j} + ···    (9)

Note, the ith n terms in Equation (9) give the reliability contribution of the paths in G_i. Let R_i denote the reliability contribution of G_i. The lower bound on reliability of the C_n, TR_2(C_n, p), is given as:

TR_2(C_n, p) = \sum_{i=1}^{n-2} R_i.    (10)


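By Equation (1) and Property 1, R_1 = 1 - (1 - p^n)^n. For i ≥ 2 and 1 ≤ j ≤ n, the disjoint expression of a term \bar{P}_{1,1} ··· \bar{P}_{i,j-1} P_{i,j} is evaluated using conditional probability, standard Boolean operation, and Properties 1 through 3 as:

Pr(P_{i,j})    (11)
· Pr(\bar{P}_{1,j} ··· \bar{P}_{i-1,j} | P_{i,j})    (12)
· \prod_{k=1}^{j-1} Pr(\bar{P}_{1,k} ··· \bar{P}_{i,k})    (13)
· \prod_{k=j+1}^{n} Pr(\bar{P}_{1,k} ··· \bar{P}_{i-1,k}).    (14)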


Equation (11) is easily computed as Pr(P_{i,j}) = p^n. Let P_i - P_k denote the portion of P_i remaining after any links in common with P_k are removed. For any k, l < i with k ≠ l, P_{k,j} - P_{i,j} and P_{l,j} - P_{i,j} are link disjoint, so Equation (12) is computed as:

[Figure 1. A 2-cube model of the n-cube for the TR problem; the two subcube branches are labelled TR_3(C_{n-1}, p).]

By Property 2, the reliability contribution of (15) is given as:

\prod_{r=1}^{i-1} (1 - p^{n-1-r}).    (16)

Utilizing the properties of the selected paths, each probability component of (13) has equal value. Let MG_i denote the probability of each of them, i.e.,

MG_i = Pr(\bar{P}_{1,k} ··· \bar{P}_{i,k}),    1 ≤ k < j.

By Property 2, and recursively utilizing a Boolean identity
aP,Sj  at>Bi  «  aF  *  obA^  ■  we  obtain; 

MG_i = Φ_2 + p^2 Φ_{n-2} (q + p Φ_{n-3} (q + ··· + p Φ_{n-i+1} (q + p Φ_{n-i}^2) ···)).    (17)

Similarly, (14) is computed as MG_{i-1}. Thus, the reliability contribution of path P_{i,j} is given as:

p^n \prod_{r=1}^{i-1} (1 - p^{n-1-r}) [(MG_i)^{j-1} (MG_{i-1})^{n-j}],    (18)

where MG_0 = 1, MG_1 = Φ_n, and Φ_k = 1 - p^k. Thus, the reliability contribution of G_i is obtained as:

R_i = p^n \prod_{r=1}^{i-1} (1 - p^{n-1-r}) \sum_{j=1}^{n} [(MG_i)^{j-1} (MG_{i-1})^{n-j}].    (19)

Example 2. Consider a C_6. R_1 = 1 - (1 - p^6)^6, MG_1 = Φ_6 = 1 - p^6, and MG_2 = (1 - p^2) + p^2 (1 - p^4)^2. By Equation (17) we obtain: MG_3 = Φ_2 + p^2 Φ_{6-2} (q + p Φ_{6-3}^2), MG_4 = Φ_2 + p^2 Φ_{6-2} (q + p Φ_{6-3} (q + p Φ_{6-4}^2)). Utilizing Equation (19), the reliability contributions R_i's are obtained as:

R_2 = p^6 \prod_{r=1}^{1} (1 - p^{5-r}) \sum_{j=1}^{6} [(MG_2)^{j-1} (MG_1)^{6-j}],

R_3 = p^6 \prod_{r=1}^{2} (1 - p^{5-r}) \sum_{j=1}^{6} [(MG_3)^{j-1} (MG_2)^{6-j}],

R_4 = p^6 \prod_{r=1}^{3} (1 - p^{5-r}) \sum_{j=1}^{6} [(MG_4)^{j-1} (MG_3)^{6-j}].

For p = 0.9, one gets MG_1 = 0.468559, MG_2 = 0.285797, MG_3 = 0.236268, and MG_4 = 0.224167. We also obtain R_1 = 0.989418, R_2 = 0.010038, R_3 = 0.000371, and R_4 = 0.000037, and hence Equation (10) gives TR_2(C_6, 0.9) = 0.999863.
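Equations (17), (19) and (10) translate directly into a short routine. The sketch below is one possible reading, assuming Φ_k = 1 - p^k, MG_0 = 1 and MG_1 = 1 - p^n; the function names (`phi`, `mg`, `r_contrib`, `tr2`) are ours. It reproduces MG_1 to MG_3, R_1 to R_3 and the total of Example 2 to the printed precision. Evaluating the nested form for i = 4 gives MG_4 of about 0.2269 rather than the 0.224167 quoted in the example, so the deepest nesting level should be treated as an assumption.

```python
def phi(k, p):
    """Probability that a k-link series segment is DOWN."""
    return 1.0 - p ** k


def mg(n, i, p):
    """MG_i of Equation (17), with MG_0 = 1 and MG_1 = 1 - p^n (assumed)."""
    if i == 0:
        return 1.0
    if i == 1:
        return phi(n, p)
    q = 1.0 - p
    inner = phi(n - i, p) ** 2                 # innermost squared term
    for k in range(n - i + 1, n - 1):          # k = n-i+1, ..., n-2
        inner = phi(k, p) * (q + p * inner)
    return phi(2, p) + p ** 2 * inner


def r_contrib(n, i, p):
    """Reliability contribution R_i of group G_i, Equation (19)."""
    prod = 1.0
    for s in range(1, i):
        prod *= phi(n - 1 - s, p)
    a, b = mg(n, i, p), mg(n, i - 1, p)
    series = sum(a ** (j - 1) * b ** (n - j) for j in range(1, n + 1))
    return p ** n * prod * series


def tr2(n, p):
    """Equation (10): sum of the contributions of groups G_1 .. G_{n-2}."""
    return sum(r_contrib(n, i, p) for i in range(1, n - 1))
```

As a usage check, `tr2(4, 0.7)` and `tr2(5, 0.7)` evaluate to about 0.834887 and 0.869762, the TR_2 entries of Table 4 for n = 4 and n = 5.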

(b) A 2-cube Model for TR

Consider the C_n constructed of two C_{n-1}'s that are connected by 2^{n-1} links. We call these links exterior links, and the links within each C_{n-1} interior links.

Theorem 1. Given a bound on TR for C_{n-1}, TR_3(C_{n-1}, p), a 2-cube based lower bound on TR is expressed as:

TR_3(C_n, p) = 2p TR_3(C_{n-1}, p) - (p TR_3(C_{n-1}, p))^2.

Proof. The lower bound on TR of the C_n is computed by including only two exterior links, i.e., a link that connects the nodes 0 of the two (n-1)-cubes, and a link which connects the nodes 2^{n-1} - 1 in the subcubes (see Figure 1). Note, the (s, t) for the n-cube in the figure is (s_1, t_2), and the two exterior links considered are (s_1, s_2), (t_1, t_2) with link reliability p. The probability of (s_1, t_1) or (s_2, t_2) connectivity is given as the lower bound on TR for C_{n-1}. Thus, (s_1, t_2) connectivity is given as the TR for a C_2 with link reliability for (s_1, t_1), (s_2, t_2) given as TR_3(C_{n-1}, p), and the other two links have probability of success p. □
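The recursion of Theorem 1 is the reliability of two parallel branches, each an exterior link in series with an (n-1)-cube. A minimal sketch follows; the base value is supplied by the caller (for instance the exact C_3 value), and the function names are ours.

```python
def two_cube_step(prev, p):
    """Theorem 1: two parallel branches, each with reliability p * prev."""
    branch = p * prev
    return 2.0 * branch - branch ** 2


def tr3(n, p, base_n, base_value):
    """Apply the 2-cube step from a known bound on C_{base_n} up to C_n."""
    value = base_value
    for _ in range(base_n, n):
        value = two_cube_step(value, p)
    return value
```

Starting from the exact C_3 value 0.865797 at p = 0.7, one step gives about 0.844810, the TR_3 entry for n = 4 in Table 4. Repeated application converges to the fixed point (2p - 1)/p^2 of the map, which explains why the TR_3 columns of Tables 2 to 5 level off.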


[Figure 2. Two sections of n-cube: two (n-1)-cubes.]

Theorem 2. The lower bound on terminal reliability for C_n, TR(C_n, p), is computed as

TR(C_n, p) = 2p TR'(C_{n-1}, p) - (p TR'(C_{n-1}, p))^2,

where TR'(C_{n-1}, p) = max(TR_2(C_{n-1}, p), TR(C_{n-1}, p)).

Proof. Obvious. □

The equations for the lower bound on TR in Equation (10) and Theorems 1 and 2 can each be computed in time polynomial in the dimension of the cube.
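One way to read Theorem 2 together with the TR column of the tables is that, for each n, the tightest bound is the larger of TR_2(C_n, p) and the 2-cube step applied to the tightest bound for C_{n-1}. The sketch below implements that reading; it is an assumption on our part, chosen because it reproduces the TR column of Table 4, and the function names are ours.

```python
def two_cube_step(prev, p):
    """2-cube recursion: two parallel branches with reliability p * prev."""
    branch = p * prev
    return 2.0 * branch - branch ** 2


def tightest_bounds(tr2_by_n, p):
    """Map n -> best bound, given n -> TR_2(C_n, p) (exact value for smallest n)."""
    best = {}
    for n in sorted(tr2_by_n):
        candidate = tr2_by_n[n]
        if n - 1 in best:
            candidate = max(candidate, two_cube_step(best[n - 1], p))
        best[n] = candidate
    return best
```

Fed with the TR_2 column of Table 4 (p = 0.7) for n = 3 to 9, this returns 0.844810 for n = 4, keeps TR_2 for n = 5 to 7, and switches back to the recursion for n = 8 and n = 9 (about 0.844479 and 0.832830).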

4. Improved Lower Bound on Network Reliability

Let us consider an n-cube C_n as two (n-1)-cubes connected by 2^{n-1} links. We evaluate the lower bound on reliability of a C_n from the known lower bound on reliability of a C_{n-1}. The proposed algorithm divides the problem into three mutually disjoint cases. In addition, the events in each case are also mutually disjoint among themselves. Thus, the lower bound on reliability can be calculated as the sum of the lower bounds on reliability obtained from the following three cases.

Case 1: Both (n-1)-cubes and only one exterior link operate.

The graph model for Case 1 is shown in Figure 2. Since both C_{n-1}'s operate, only one good exterior link is needed to make the two subcubes combine into a connected C_n. The disjoint expression of Case 1 is given as:

NR(C_{n-1}, p)^2 · (p + pq + pq^2 + ··· + pq^{2^{n-1}-1}).    (20)

The events in Case 1 are all mutually disjoint since we consider the operating exterior links one at a time, i.e., an exterior link operates when the links previously considered good fail. Since 0 ≤ p, q ≤ 1, Equation (20) can be reduced to:

NR(C_{n-1}, p)^2 (1 - q^{2^{n-1}}).    (21)

Note, Case 1 includes ST_Case1(C_n) = ST(C_{n-1}) · ST(C_{n-1}) · 2^{n-1} minimum spanning trees.
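The reduction from Equation (20) to Equation (21) is a finite geometric series. The sketch below evaluates both forms for a given subcube reliability; the argument names are ours, with `m` standing for the number of exterior links 2^{n-1}.

```python
def case1_series(nr_sub, m, p):
    """Equation (20): the first operating exterior link is link 1, 2, ..., m."""
    q = 1.0 - p
    return nr_sub ** 2 * sum(p * q ** k for k in range(m))


def case1_closed(nr_sub, m, p):
    """Equation (21): at least one of the m exterior links operates."""
    return nr_sub ** 2 * (1.0 - (1.0 - p) ** m)
```

For C_3 built from two C_2's at p = 0.9 (NR(C_2, 0.9) = 0.9477, m = 4), both forms give about 0.898045, the Case 1 entry of Table 1.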

Case 2: All 2^{n-1} exterior links operate.

In Case 2, a C_n is connected if at least one C_{n-1} operates. When all exterior links are good, the congruent nodes in the two C_{n-1}'s can be combined to form an (n-1)-cube, C'_{n-1}, in which each pair of adjacent nodes is connected by double links. Consider the C_3 shown in Figure 3. Figure 4a gives the graph representation for Case 2. The reliability of the C'_{n-1} is given as the lower bound on reliability of a C_{n-1} with link reliability p^2 + 2pq (= 1 - q^2), since the C'_{n-1} is connected when it contains at least one working spanning tree, which may have links from either or both C_{n-1}'s. Thus, for p' = p^2 + 2pq we have the following expression.

p^{2^{n-1}} (NR(C_{n-1}, p') - NR(C_{n-1}, p)^2).    (22)

Events in Case 2 are mutually disjoint. In addition, Case 2 is mutually disjoint from Case 1. The events considered in NR(C_{n-1}, p') include the possibilities that both subcubes operate, so we subtract

NR(C_{n-1}, p)^2 to make Case 2 disjoint from Case 1. The number of minimum spanning trees used in Case 2, ST_Case2(C_n), is 2^{2^{n-1}-1} · ST(C_{n-1}).

Case 3: 2 ≤ i ≤ 2^{n-1} - 1 exterior links and one (n-1)-cube operate.

Case 3 is intractable for large C_n's, so we compute a lower bound on reliability.

[Figure 4. Merging nodes in cube C_3 for Cases 2, 3A, and 3B.]

Case 3A for i = 2^{n-1} - 1:

Figure 4b depicts the graph representation of Case 3A for a C_3. As an example, the figure shows an event in which links 8, 9, and 10 operate while link 11 fails. Let an isolated node be defined as a node in a disconnected subcube C_{n-1} which has a failed exterior link as one of its links. The C_3 is connected if one cube operates and the isolated node is connected by at least one of its interior links (links 6 and 7). However, we also have included the case in which the two (n-1)-cubes operate. To make this case disjoint with Case 1, we subtract NR(C_{n-1}, p)^2. The reliability contribution for this case is given as:

2 · 2^{n-1} p^{2^{n-1}-1} q (NR(C_{n-1}, p)(1 - q^{n-1}) - NR(C_{n-1}, p)^2).    (23)

Case 3A considers ST_Case3A(C_n) = ST(C_{n-1}) · (n-1) · 2^n minimal spanning trees.

Case 3B for i = 2^{n-1} - 2:

The two isolated nodes can be at a distance one or more. We consider the reliability expression for this case in two groups.

The two nodes are at distance 1: Figure 4c presents the graph model of this case for C_3. The figure shows the event when links 8 and 9 operate, while links 10 and 11 fail. The lower bound expression for this case is:

2 l' p^{2^{n-1}-2} q^2 (NR(C_{n-1}, p) · (p (1 - q^{2(n-1)-2}) + q (1 - q^{n-2})^2) - NR(C_{n-1}, p)^2),    (24)

where l' is the number of links in C_{n-1}.

Since the two nodes are at distance 1, there is a link connecting them. First, consider the link operational. Thus, the two nodes can be merged into a node X. The C_n is connected if node X is connected by at least one of its interior links. Second, we consider the connecting link as failed. In this case, each of the two nodes should individually be connected by at least one of its interior links. However, we have also considered the case in which both C_{n-1}'s operate. Thus, to make this case disjoint from Case 1, we subtract NR(C_{n-1}, p)^2. This case considers ST(C_{n-1}) (n^3 - 3n^2 + 2n) 2^{n-1} minimal spanning trees.

The two nodes are at a distance of more than 1: Figure 4d presents the graph model of this case for C_3. The figure shows the event in which links 8 and 10 operate, while links 9 and 11 fail. The lower bound expression for this case is:

2 (\binom{2^{n-1}}{2} - l') p^{2^{n-1}-2} q^2 (NR(C_{n-1}, p)(1 - q^{n-1})^2 - NR(C_{n-1}, p)^2).    (25)

When the two isolated nodes are at a distance of more than one from each other, the n-cube is connected when each of them is connected by at least one of its interior links. Again, we have considered the case in which both (n-1)-cubes operate, which accounts for the subtraction. The number of minimal spanning trees considered in this case is ST(C_{n-1}) (n-1)^2 (2^{n-1} - n) 2^{n-1} and thus, overall, Case 3B uses ST_Case3B(C_n) = ST(C_{n-1}) 2^{n-1} (2^{n-1}(n^2 - 2n + 1) - n^2 + n) minimal spanning trees.

Case 3C for 2 ≤ i ≤ 2^{n-1} - 3:

The problem of obtaining an exact expression for this case becomes intractable, especially for large n. However, one can get a lower bound on the reliability contribution of this case by considering a lower bound on the minimal spanning trees that have not been included in the reliability evaluated so far, and take a lower bound on reliability for each of them, i.e., multiply the number by p^{2^n - 1} q^{l - 2^n + 1}, where l is the number of links in C_n. Thus, a lower bound on reliability expression for Case 3C is:

(ST(C_n) - ST_Case1(C_n) - ST_Case2(C_n) - ST_Case3A(C_n) - ST_Case3B(C_n)) p^{2^n - 1} q^{l - 2^n + 1}.    (26)

Combining all cases,

NR(C_n, p) ≥ (21) + (22) + (23) + (24) + (25) + (26).    (27)

The equation for the lower bound on NR in Equation (27) can be computed in time polynomial in the dimension of the cube.

Table 1 shows the comparisons of the exact reliability of C_3 for the cases described, obtained by CAREL [11], and the lower bound results produced by the proposed algorithm.
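The case-by-case construction can be sketched for C_3, where the subcube C_2 has a known exact reliability. The code below is a hedged illustration: the exponents used for general n follow the case descriptions (an isolated node has n-1 interior links, a merged pair has 2(n-1)-2) and are an assumption; only the n = 3 values have been checked, against the lower bound column of Table 1. The residual tree count for Case 3C is passed in by the caller (64 for C_3, from Table 1), and all function names are ours.

```python
from math import comb


def nr_c2(p):
    """Exact all-terminal reliability of the base cube C_2 (a 4-cycle)."""
    return 4 * p ** 3 * (1 - p) + p ** 4


def nr_cases(n, nr_sub, p):
    """Contributions of Cases 1, 2, 3A and 3B for C_n.

    nr_sub(x) returns the (bound on the) reliability of C_{n-1} at link
    reliability x.
    """
    q = 1.0 - p
    m = 2 ** (n - 1)                        # exterior links
    links_sub = (n - 1) * 2 ** (n - 2)      # l': links inside one (n-1)-cube
    a = nr_sub(p)
    both = a * a                            # both subcubes operate
    case1 = both * (1 - q ** m)                                    # (21)
    case2 = p ** m * (nr_sub(1 - q * q) - both)                    # (22)
    case3a = 2 * m * p ** (m - 1) * q * (a * (1 - q ** (n - 1)) - both)  # (23)
    w = p ** (m - 2) * q ** 2               # exactly two exterior links fail
    near = p * (1 - q ** (2 * (n - 1) - 2)) + q * (1 - q ** (n - 2)) ** 2
    dist1 = 2 * links_sub * w * (a * near - both)                  # (24)
    far_pairs = comb(m, 2) - links_sub
    far = 2 * far_pairs * w * (a * (1 - q ** (n - 1)) ** 2 - both)  # (25)
    return {"1": case1, "2": case2, "3a": case3a, "3b": dist1 + far}


def case3c(n, residual_trees, p):
    """Equation (26): each leftover spanning tree up, every other link down."""
    tree_links = 2 ** n - 1
    all_links = n * 2 ** (n - 1)
    return residual_trees * p ** tree_links * (1 - p) ** (all_links - tree_links)
```

For n = 3 and p = 0.9, `nr_cases(3, nr_c2, 0.9)` gives about 0.898045, 0.066445, 0.023379 and 0.002487, and `case3c(3, 64, 0.9)` about 0.000306; their sum is about 0.990663, the new bound listed for the 3-cube in Table 6.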

5. Results and Comparisons

Tables 2 to 5 show the bounds for TR_1, TR_2, TR_3, and the tightest lower bound TR for various C_n's, n = 3, 4, ..., 16 and various values of p. For n = 3, we have exact reliability values. As expected, TR_2 is always tighter than TR_1 since the minpaths considered in TR_1 are only a subset of the ones for TR_2. The lower bound obtained by the 2-cube model, TR_3, performs worse than TR_2 for p > 0.8. However, TR_3 is tighter than TR_2 for p ≤ 0.6 as shown in Table 3. For 0.6 < p ≤ 0.8, TR_3 is tighter than TR_2 for some C_n's. As shown in Tables 2 to 5, TR (computed by Theorem 2) always provides the best bounds for any values of p and n. As the results show, for n ≤ 16 and p ≤ 0.6, TR_3 is always as tight as TR, and thus it is suggested to use the 2-cube model for TR. On the contrary, for p > 0.8, TR_2 is equal to TR for n = 3, 4, ..., 16, and hence, Equation (10) is sufficient to obtain TR.

Table 6 presents the comparisons of the best known lower bounds on NR obtained in [8] with the new lower bounds generated by our proposed technique for n = 3, 4, ..., 16. Note, we have used the modified Bulka and Dugan algorithm [8], i.e., by replacing T1 with Equation (5). As shown in the table, our lower bounds are tighter than the Bulka and Dugan bounds, especially for n > 10 and p < 0.9.

Table 1.
Reliability for C_3 - Exact versus Lower Bound for p = 0.9

Case    # Spanning Trees           Reliability
        Exact    Lower Bound       Exact       Lower Bound
1       64       64                0.898045    0.898045
2       32       32                0.066445    0.066445
3a      64       64                0.023380    0.023379
3b      160      160               0.002488    0.002487
3c      64       64                0.000306    0.000306

Table 2.
Lower Bounds Terminal Reliability for n-cubes for p = 0.85

n-cube   TR_1       TR_2       TR_3       TR
3        0.985002   0.985002   0.985002   0.985002
4        0.947798   0.987683   0.973513   0.987683
5        0.946725   0.995617   0.970239   0.995617
6        0.941613   0.998009   0.969271   0.998009
7        0.933169   0.998899   0.968982   0.998899
8        0.921529   0.999254   0.968895   0.999254
9        0.906632   0.999371   0.968869   0.999371
10       0.888356   0.999352   0.968861   0.999352
11       0.866609   0.999221   0.968859   0.999221
12       0.841372   0.998965   0.968858   0.998965
13       0.812733   0.998552   0.968858   0.998552
14       0.780894   0.997929   0.968858   0.997929
15       0.746176   0.997030   0.968858   0.997030
16       0.709001   0.995778   0.968858   0.995778

Table 3.
Lower Bounds Terminal Reliability for n-cubes for p = 0.6

n-cube   TR_1       TR_2       TR_3       TR
3        0.703912   0.703912   0.703912   0.703912
4        0.426048   0.615492   0.666317   0.666317
5        0.332856   0.624123   0.639749   0.639749
6        0.249246   0.591016   0.620358   0.620358
7        0.180243   0.533064   0.605886   0.605886
8        0.126730   0.462511   0.594908   0.594908
9        0.087128   0.387797   0.586480   0.586480
10       0.058847   0.315047   0.579951   0.579951
11       0.039192   0.248301   0.574858   0.574858
12       0.025811   0.190723   0.570863   0.570863
13       0.016846   0.142791   0.567717   0.567717
14       0.010913   0.104371   0.565232   0.565232
15       0.007030   0.073123   0.563263   0.563263
16       0.004504   0.033085   0.561700   0.561700

Table 4.
Lower Bounds Terminal Reliability for n-cubes for p = 0.7

n-cube   TR_1       TR_2       TR_3       TR
3        0.865797   0.865797   0.865797   0.865797
4        0.666554   0.834887   0.844810   0.844810
5        0.601495   0.869762   0.833019   0.869762
6        0.528102   0.875903   0.826205   0.875903
7        0.452070   0.865198   0.822206   0.865198
8        0.378122   0.842149   0.819837   0.844479
9        0.309758   0.808747   0.818427   0.832830
10       0.249144   0.766476   0.817585   0.826095
11       0.197228   0.716753   0.817081   0.822141
12       0.154017   0.661019   0.816779   0.819799
13       0.118887   0.600801   0.816598   0.818404
14       0.090877   0.537781   0.816489   0.817571
15       0.068895   0.473799   0.816424   0.817072
16       0.051868   0.410767   0.816385   0.816774

Table 5.
Lower Bounds Terminal Reliability of n-cubes for p = 0.95

n-cube   TR_1       TR_2       TR_3       TR
3        0.999617   0.999617   0.999617   0.999617
4        0.998816   0.999873   0.997463   0.999873
5        0.999408   0.999987   0.997253   0.999987
6        0.999654   0.999998   0.997232   0.999998
7        0.999773   1.000000   0.997230   1.000000
8        0.999835   1.000000   0.997230   1.000000
9        0.999871   1.000000   0.997230   1.000000
10       0.999892   1.000000   0.997230   1.000000
11       0.999904   1.000000   0.997230   1.000000
12       0.999911   1.000000   0.997230   1.000000
13       0.999914   1.000000   0.997230   1.000000
14       0.999914   1.000000   0.997230   1.000000
15       0.999912   1.000000   0.997230   1.000000
16       0.999907   1.000000   0.997230   1.000000

Table 6.
Lower Bounds Network Reliability for n-cubes

          Link Reliability = 0.9        Link Reliability = 0.95
n-cube    Bulka-Dugan   New Bound       Bulka-Dugan   New Bound
3         0.987870      0.990663        0.998694      0.998909
4         0.994361      0.997593        0.999754

Abstract

Multistage Interconnection Networks (MINs) provide a good communication medium between multiple processors and memory modules. Previous reliability evaluation efforts for MINs assumed that all failures are statistically independent and that no degraded operational modes exist, though these assumptions are inconsistent with realistic conditions. In this paper, we relax both assumptions and provide efficient algorithms for terminal, broadcast, and K-terminal reliability evaluation in the Shuffle-Exchange Network with an extra stage (SENE), a redundant path MIN. The shock model is used in a modified form to incorporate failure dependency and multiple operational modes into the reliability evaluation. For K-terminal reliability, let k = |K|. For an N×N SENE, the algorithms run in time O(log N), O(log N), and O(k log N), respectively.

1. Introduction

Parallel computers have developed very rapidly in response to the need for high speed computing in many applications. Several parallel computers, such as IBM's RP3 (Hsu et al., 1987; Wang et al., 1989), the University of Illinois' Cedar (Konicek et al., 1991), and Purdue University's PASM (Schwederski et al., 1991), employ a multistage interconnection network (MIN) to provide a communication medium between processors and processors or shared memory modules. A MIN consists of N inputs and N outputs and (typically) n stages of switching elements (SEs), where n = log_2 N. An SE is generally a 2x2 crossbar network and provides either a straight (T-mode) or cross (X-mode) connection. The SEs in one stage are connected to the SEs in adjacent stages by links that are arranged in patterns. Based on these patterns, various MINs are called Omega, Flip, Indirect binary cube, Modified data manipulator, baseline, and reverse baseline networks. All these networks are topologically equivalent (Bermond et al., 1989). To improve the fault tolerance of the Omega network, an extra stage may be added to provide a redundant path from each input to each output. Thus, if one path fails due to faulty links and/or SEs, an input can still reach an output through another path. This type of Omega network is called the Shuffle Exchange Network with an Extra stage
(SENE).

K-terminal reliability is defined as the probability that a set of working paths exists from one specified input to each of a specified set K of outputs. For an N×N MIN and |K| = 1 (N), it is also known as terminal (broadcast) reliability.

Previous reliability evaluation algorithms for MINs assumed 2-mode fault models
(working or failed) for each component and statistically independent failures (Botting et al.,
1989; Varma and Raghavendra, 1989; Cheng and Ibe, 1992; Trahan and Rai, 1992). These
assumptions are common in reliability evaluation, as the evaluation problems are intractable for
general networks (Colbourn, 1987). These assumptions, however, fail to adequately model
real-world situations. For example, Davis et al. (1985) and Schwederski et al. (1991)
discussed instances of dependent failures, or fault side-effects, in the PASM. The 2-mode
model leads to an underestimate of the reliability because it does not allow a degraded
operational mode of SEs. The assumption of independent failures leads to an overestimate of
reliability. Recently, some researchers have addressed the problem of incorporating dependent
failures into reliability computations for general networks (Boyles and Samaniego, 1984; Lam
and Li, 1986; Le and Li, 1989).

In this paper, we present efficient algorithms for terminal, broadcast, and K-terminal
reliability evaluation of an SENE composed of identical SEs, allowing multimode and
dependent failures. The algorithms assume a 4-mode model of an SE: a fully operational
mode, two degraded operational modes, namely the stuck-at-T mode and the stuck-at-X mode, and
a completely failed mode. Moreover, we assume that links are reliable. We modify the shock
model of Boyles and Samaniego (1984) to incorporate dependent failures into the reliability
evaluation algorithms of Trahan and Rai (1992). For K-terminal reliability, let k = |K|. For an
NxN SENE, the algorithms run in time O(log N), O(log N), and O(k log N), respectively.

The layout of the paper is as follows. Section 2 describes MIN basics. Section 3
discusses the shock model and explains its application to SENE reliability analysis. Section 4
presents results of the algorithms for terminal, broadcast, and K-terminal reliability of SENE
and outlines the techniques used to develop them. For brevity, we present the results only.
The derivations and proofs are discussed in (Wang, 1992).

2.  Background 

The Shuffle-Exchange Network with an Extra stage (SENE) (see figure in Cheng and
Ibe (1992)) with N inputs and N outputs is defined to be a MIN with log2 N + 1 stages of 2x2
SEs. The outputs of SEs in stage i connect to the inputs of SEs in stage i+1 by a shuffle
connection, for i = 0, 1, ..., n-1. Let SE_{i,j} denote the jth SE in stage i, where 0 <= i <= n and
0 <= j <= N/2 - 1. To help develop a fault model, we consider an SE to be composed of two input
controllers (ICs), two output controllers (OCs), and a central controller (CC). Figure 1
illustrates the structure of an SE. Paths through the network are established by a path request
mechanism in which a CC decides whether the IC should transfer the input to the OC in T- or
X-connection mode.


3.  Shock  Model  and  Dependency  Analysis 

To analyze SE failure dependency, we use a shock model (Boyles and Samaniego,
1984). The shock model and the event based reliability model (EBRM) proposed by Lam and
Li (1986) are identical. The shock model was defined for 2-mode components. We generalize
its application to include multimode SEs in our reliability analysis. An extension of the EBRM
to multimode components, the cause based multimode model (CBMM), does exist (Le and Li,
1989). Our model, however, is more simply defined as an extension of the shock model or
EBRM than as a restricted application of the CBMM. Further, we can explain our generalized
model in terms of shocks to subcomponents of SEs. Section 3.1 explains the concepts of the
model, while its application to SENE analysis is described in Section 3.2.

3.1.  SHOCK  MODEL 

The  shock  model  assumes  that  statistically  independent  shocks,  which  occur  with 
known  probability,  cause  the  failure  of  network  components.  When  a  shock  occurs,  it  causes 
the  failure  of  a  specific  component  or  set  of  components.  We  say  that  the  component  or 
components  are  affected  by  the  shock.  A  shock  affecting  a  single  component  is  called  an 
individual  shock  (IS),  while  a  shock  affecting  multiple  components  is  called  an  external  shock 
(ES). 

For a network with n components, up to 2^n - 1 shocks may be defined theoretically.
Most shocks, however, may never happen in the real world. Therefore, we need to consider
only those shocks whose occurrence is reasonable in the specified real world conditions.
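
As a minimal sketch of the shock model just described, the Python fragment below computes the probability that a component, or a group of components, escapes every shock that affects it. The component names, shock probabilities, and function names are hypothetical and chosen only for illustration; the one property taken from the model is that shocks are statistically independent.

```python
from math import prod


def survival_probability(component, shocks):
    """Probability that no shock affecting `component` occurs.

    `shocks` is a list of (probability, affected_components) pairs.
    Shocks are assumed statistically independent, as in the shock model.
    """
    return prod(1.0 - p for p, affected in shocks if component in affected)


def joint_survival(components, shocks):
    """Probability that none of `components` is hit by any shock."""
    wanted = set(components)
    return prod(1.0 - p for p, affected in shocks if wanted & affected)


# Hypothetical example: one individual shock (IS) per component and one
# external shock (ES) that hits components "a" and "b" together.
shocks = [
    (0.10, {"a"}),       # IS on a
    (0.10, {"b"}),       # IS on b
    (0.10, {"c"}),       # IS on c
    (0.05, {"a", "b"}),  # ES on a and b
]

p_a = survival_probability("a", shocks)   # 0.9 * 0.95
p_b = survival_probability("b", shocks)   # 0.9 * 0.95
p_ab = joint_survival(["a", "b"], shocks) # 0.9 * 0.9 * 0.95
print(p_a, p_b, p_ab, p_a * p_b)
```

In this example the joint survival probability of "a" and "b" differs from the product of their individual survival probabilities, which is how the shared external shock introduces dependence between component failures.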

3.2. SENE FAILURE ANALYSIS

To apply the shock model to a SENE in which SEs may operate in degraded modes, we
will expand the notion of an IS, while leaving the notion of an ES unchanged. Instead of
associating a single IS with a single component, we will associate one IS with each failed or
degraded working mode of a component. Technically, a CC having a stuck-at logic fault can
produce a degraded working mode for an SE. Class 1 shocks (defined below) model this
scenario.

For ESs, we restrict our analysis to two modes only, so the occurrence of an ES causes
all affected SEs to fail. SEs that are far apart are unlikely to be affected by a single ES. In
general, the classes of ESs that we define are motivated by the failure of one SE causing the
failure of other SEs due to the links connecting them. For shared-memory computers, each
processor reads from or writes to the shared memory through the MIN, and so communication
flows in both directions. When the forward (reverse) part of an SE has failed, this SE will send
garbage messages to either one or both of the SEs in the next (previous) stage that are
connected to it and may cause one or both of them to fail. Hence, we assume that an external
shock causes a failure in adjacent SEs either in the forward direction (towards the network
outputs) or in the backward direction (from the fault towards the network inputs). In a practical
design, a failure of an output (input) controller can create a forward (backward) external shock.
Such dependent failures have been noted in MINs by Davis et al. (1985) and Schwederski et al.
(1991). In the terminology of Schwederski et al., the shocks that we have described above
correspond to fault side-effects with forward reach of 1, backward reach of 1, and span of 2.
Class 2, 3out, and 3in shocks (defined below) model this scenario.

To help compute the reliabilities, we consider the following classes of shocks. Each
class of shock will affect a certain structured set of SEs. For example, a Class 3out shock
(defined below) will affect an SE and the two SEs to which it is connected in the next stage.
We define such a shock for every SE. Because the structures affected by the same class of
shocks are the same and all SEs are identical, the probabilities that the same class of shock
occurs are identical for all such shocks. The probability that each class of shock occurs may be
time dependent. The purpose of our work is to find a relationship between the probabilities
of each class of shock and the reliability of the whole network.

Class 1 shock. Exactly one SE is damaged or fails.

A Class 1 shock is an IS that affects only one SE. We modify the shock model as
described above to handle stuck-at-T and stuck-at-X modes of SEs. Let Z_{ij}(1,f) denote the
shock that causes SE_{i,j} to fail completely, let Z_{ij}(1,t) denote the shock that causes SE_{i,j}
to be stuck at T mode, and let Z_{ij}(1,x) denote the shock that causes SE_{i,j} to be stuck at X mode.
The probabilities that the three Class 1 shocks occur are p_f, p_t, and p_x, respectively. Let p_w
denote the probability that none of the three Class 1 shocks affecting SE_{i,j} occurs; hence,
p_w = 1 - (p_f + p_t + p_x).

Class 2 shock. Exactly two SEs fail simultaneously.

A Class 2 shock is an ES that affects two SEs connected by a link. Four Class 2
shocks affect each SE SE_{i,j}, if SE_{i,j} is not in the input or output stages. These four shocks will
be denoted as Z_{ij}(2,k), where k = 1, 2, 3, 4 (Figure 2(a)). If SE_{i,j} is in the input stage, then
only Z_{ij}(2,3) and Z_{ij}(2,4) affect it. If SE_{i,j} is in the output stage, then only Z_{ij}(2,1) and
Z_{ij}(2,2) affect it. Let the probability that a Class 2 shock occurs be p_2 and does not occur be
q_2 = 1 - p_2.

Class 3out shock and Class 3in shock. Exactly three SEs fail simultaneously.

Class 3out shock.

A Class 3out shock is an ES that affects an SE and the two SEs to which it is connected
by its output links. Each SE in a SENE is affected by three Class 3out shocks denoted as
Z_{ij}(3,k), where k = 1, 2, 3, except for SEs in the input and output stages (Figure 2(b)). If SE_{i,j}
is in the input stage, then only Z_{ij}(3,3) affects it. If SE_{i,j} is in the output stage, then only
Z_{ij}(3,1) and Z_{ij}(3,2) affect it. Let the probability that a Class 3out shock occurs be p_3o and
does not occur be q_3o = 1 - p_3o.


Class 3in shock.

A Class 3in shock is an ES that affects an SE and the two SEs to which it is connected
by its input links. There are three Class 3in shocks affecting each SE, denoted as
Z_{ij}(3,k), where k = 4, 5, 6, except for SEs in the input and output stages (Figure 2(c)). SE_{i,j} is
affected by only Z_{ij}(3,5) and Z_{ij}(3,6) if it is in the input stage and is affected by only Z_{ij}(3,4) if
it is in the output stage. Let the probability that a Class 3in shock occurs be p_3i and does not
occur be q_3i = 1 - p_3i.
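
The class definitions above fix how many shocks of each class can reach a given SE, so the probability that one SE remains fully operational follows directly from the independence of shocks. The sketch below counts the shocks per stage as stated in the definitions (four Class 2, three Class 3out, and three Class 3in shocks for an intermediate SE, fewer at the input and output stages) and multiplies the corresponding non-occurrence probabilities. The function names are hypothetical, and the sketch assumes that Class 1 shocks are independent of the external shocks.

```python
def shock_counts(stage, n):
    """Number of Class 2, Class 3out, and Class 3in shocks that affect
    an SE in `stage` of a SENE whose stages are numbered 0..n."""
    if stage == 0:       # input stage: Z(2,3), Z(2,4); Z(3,3); Z(3,5), Z(3,6)
        return 2, 1, 2
    if stage == n:       # output stage: Z(2,1), Z(2,2); Z(3,1), Z(3,2); Z(3,4)
        return 2, 2, 1
    return 4, 3, 3       # intermediate stage


def prob_fully_operational(stage, n, p_f, p_t, p_x, p_2, p_3o, p_3i):
    """Probability that no shock of any class affects the SE."""
    p_w = 1.0 - (p_f + p_t + p_x)          # no Class 1 shock occurs
    c2, c3o, c3i = shock_counts(stage, n)
    return p_w * (1 - p_2) ** c2 * (1 - p_3o) ** c3o * (1 - p_3i) ** c3i


# Hypothetical shock probabilities for a 16x16 SENE (n = 4).
print(prob_fully_operational(2, 4, 0.01, 0.01, 0.01, 0.001, 0.001, 0.001))
```

For an intermediate SE this evaluates p_w q_2^4 q_3o^3 q_3i^3; with all external shock probabilities set to zero it reduces to p_w.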


4.  Reliability  Evaluation 

We now present the results for terminal (TR), broadcast (BR), and K-terminal (KR)
reliability measures. Trahan and Rai (1992) developed a straightforward algorithm for TR and
recursive algorithms for BR and KR of a SENE under assumptions of independent and 2-mode
SE failures. They noted that SENE paths form a simple series-parallel graph for TR and a pair
of intersecting binary trees for BR and KR. To incorporate dependent and multimode failures,
we follow their concept, but must include a careful and much more detailed accounting of the
shocks that may affect the SEs on the paths. Note that a single ES may affect one, two, or
three SEs on the relevant paths.

To  compute  TR,  note  the  following. 

a. For a SENE, there exist two paths from each input to each output. The two paths share
an SE in the input stage and an SE in the output stage, but are otherwise disjoint.

b. A routing tag can be used to set the connections in switches on a path from an input s to an
output d. If an input number is s = s_1 s_2 ... s_n and an output number is d = d_1 d_2 ... d_n, where
s_1, s_2, ..., s_n and d_1, d_2, ..., d_n are the bits of the binary representations of s and d,
respectively, then the routing tag from s to d is r = r_1 r_2 ... r_n, where r_1, r_2, ..., r_n are given by
r_i = s_i XOR d_i, for each 1 <= i <= n.
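
The routing tag in item b is a bitwise exclusive-or of the input and output numbers. A minimal Python sketch follows; the function name is hypothetical, and the sketch assumes that s_1 and d_1 are the most significant bits, which the text does not state.

```python
def routing_tag(s, d, n):
    """Return the routing tag bits [r_1, ..., r_n] for input s and output d,
    where r_i = s_i XOR d_i and bit 1 is taken as the most significant bit
    (an assumption of this sketch)."""
    r = s ^ d
    return [(r >> (n - i)) & 1 for i in range(1, n + 1)]


# Example for a 16x16 network (n = 4): input 0101, output 0011.
print(routing_tag(0b0101, 0b0011, 4))
```

Because the tag is a single XOR on machine words, it is computed in constant time under the unit-cost model for bitwise operations.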

Theorem 1. For an input s and an output d in an NxN SENE, the terminal reliability can be
computed using the equation below in O(log N) time.

n  n 

TR{N,s,d)  =  ip,  +  o^)  (Px-^Pw)‘=‘  ^72  (43^43/) 

n-l  n-\ 

iPz  +  Pw)  ?r(‘73<,d3i) 


I 


Theorem 2. For an input s in an NxN SENE, the broadcast reliability can be evaluated using
the equation below in O(log N) time.

BRiN)  =  {(pt  +  Px)Pw~^  -  2p3o<73o~')^2''3‘'^^  3<73i<73o~'5^  W* 
where  a  =  PwiP2)  (^3o^3t)*’>  P  =  ^?Hf^2”‘?3o^3i»  BR'i^  can  be  computed  by 
BRXN)  =  a-q^o’hBRXN/Dr  +  2(p,  +  p^)p^~^a^^''^BRXNa) 

+  2p^‘^?3oIPw(P/^  -  1) 

The base case BR'(4) is defined as

BR  '(4)  =  [( i-py)^  +  2(PjP^  +P,  +  Px-  PJ(  ^-Pf) 

+  2p^,(p2<?3o)  'lPw^<72^'?3o'*^3i^- 

Definition. For K-terminal reliability, we describe each SE on a path from a specified
input s to an output in K as marked.

Theorem 3. For an input s and a set K of outputs in an NxN SENE, where k = |K|, the
K-terminal reliability can be computed by the equation below in O(k log N) time.

KRiKM,s)  =  [ip,  +  p,  -  2p,p^q;i)KR,iK,N,s)  +  pMKIi^iK,N,s)]qlq^ql. 

Values for KR_1(K,N,s) and KR_2(K,N,s) can be computed by recurrence expressions
according to the different cases enumerated below. The base case for N = 4 can be computed by
considering several different cases. For the sake of brevity, the results for the base case are not
enumerated.

Case I: Only one child of the considered SEs is marked.

1. If the T mode in these SEs allows working paths from input s to the k outputs, then
KR_1(K,N,s) and KR_2(K,N,s) can be computed by

KRiiK,N,s)  =  (Pr 

KR2iK,N,s)=  (Pt+Pw)^^2(^.Y.4)+2(p,+p^)(P;,+Py  +  a-*Pw-w(^>Y.-r) 
•<72(<73o^3i)**- 

2. If the X mode in these SEs allows working paths from input s to the k outputs, then
KR_1(K,N,s) and KR_2(K,N,s) can be computed by the equations above, exchanging p_t
and p_x.

Case II: Both children of the considered SEs are marked. Let K_l and K_r be the subsets of K that
lie in the left and right subgraphs, respectively.

KR_1(K,N,s) can be computed by

p./Awi- 

KR_2(K,N,s) can be computed by

1. If input s can access the right (left) subgraph when the considered SEs are set in T (X) mode, then


) 


KR2iK,y,s)  = 


plq2Sf^R2{K„^,s]KR2{K„^,s' 


i  ; 
N 


+  2PtP^qiiKRi[  Ri,^,s  Jo.l  K„^.s 


2  ; 
N 


*  lPzP^^f:i>2[K('J.AKK\[Xr.J.S 


*i(p4p/  *  «'V.  - 1)?£,'  -  /f,,  J.il 

6  4  4 

•‘72'?3o^3t- 


2. If input s can access the left (right) subgraph when the considered SEs are set in T (X) mode, then
KR_2 can be computed by the equation above, exchanging p_t and p_x.

Summary and Conclusion. Multistage interconnection networks (MINs) are a widely studied
means of interconnecting processors to memory or processors to processors by stages of switches. In
this paper, we have presented a set of efficient reliability evaluation algorithms for the terminal
reliability, broadcast reliability, and K-terminal reliability problems in the shuffle-exchange network
with an extra stage (SENE) MIN, with time complexities O(loglog N), O(log N), and O(N log N),
respectively. For each of these problems, the best previous approaches took time exponential in N.

1. Introduction

To achieve the faster computing speeds imperative for many computer applications, the use of
multiple processors operating in parallel is necessary. Consequently, the reliability of the network
interconnecting these processors is of notable importance, as is the ability to quickly evaluate whether
the network can implement a desired set of connections. The problem of exact reliability evaluation,
however, is computationally intractable for most reliability measures in general networks. In
particular, the problems of terminal reliability, broadcast reliability, and K-terminal reliability
evaluation are #P-complete (Ball, 1986). K-terminal reliability is the probability that a path exists from
one specified node to each of a specified set K of nodes; broadcast (terminal) reliability is a special case
of K-terminal reliability in which K is the set of all output nodes (a single output node). For some
restricted cases, though, a network offers sufficient structure that the reliability may be efficiently
evaluated (Colbourn, 1987).

Multistage interconnection networks (MINs) are a widely studied means of interconnecting
processors to memory or processors to processors by stages of switches. MINs are also increasingly
used in experimental systems. MINs are an integral part of the design of such large scale projects as
PUMPS, CEDAR, and PASM (Siegel, 1985). Therefore, the problem of reliability evaluation of
MINs is of interest. In this paper, we present simple and efficient algorithms for terminal, broadcast,
and K-terminal reliability evaluation of the shuffle-exchange network with an extra stage. For the
reliability problems, we utilize a stochastic model of the MIN in which SEs may fail with a known
probability and links are always working. The terminal and broadcast reliability evaluation algorithms
run within time O(loglog N) and O(log N) for a network of size N, respectively. The K-terminal
reliability evaluation algorithm runs within time O(N log N). Varma and Raghavendra (1989) obtained
similar efficient algorithms for terminal reliability and broadcast reliability evaluation of the Generalized
INDRA network, Merged Delta network, and Augmented C-network.


The structure of the paper is as follows. In Section 2, we present definitions and describe
some background results used throughout the paper. Section 3 describes the reliability evaluation
algorithms and analyzes their complexity.

2.  Definitions  and  Background 

Multistage interconnection networks essentially comprise switching elements (SEs) and links
between switching elements. MINs may contain many combinations of switch sizes (Feng, 1981).
For our discussion, we restrict ourselves to shuffle-exchange MINs built from 2-input, 2-output SEs.
Such a shuffle-exchange MIN has N = 2^n inputs and outputs and n stages, with each stage comprising
N/2 switches. The stages are numbered from 1 to n. The outputs of SEs in stage i connect to the
inputs of stage i+1 by a shuffle connection, for 1 <= i < n. A SENE has n+1 stages, numbered 0 to n.
SEs at the input, stage 0, are labeled I_0, I_1, ..., I_{N/2-1}. SEs at the output stage are labeled O_0, O_1, ...,
O_{N/2-1}. SEs at stage i are labeled (i-1)N/2, (i-1)N/2 + 1, ..., iN/2 - 1, for 1 <= i < n. For the sake of
discussion, assume that the MIN connects N input processors to N output processors.

For any pair of input and output processors, the SENE possesses exactly two paths from the
input to the output. These paths share an SE in the input stage and an SE in the output stage, but are
otherwise disjoint. We will designate the path through the smaller numbered SEs as the upper path
and the other as the lower path. For example, the upper path from input 0 to output 0 contains SEs I_0,
0, 8, 16, and O_0, while the lower path contains SEs I_0, 1, 10, 20, and O_0 (Figure 1).
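
The two paths in this example can be reproduced with a short Python sketch. It assumes the standard perfect shuffle (a one-bit left rotation of the n-bit line number) between every pair of adjacent stages, that the SE at position j of a stage drives lines 2j and 2j+1, and that line x enters the SE at position x // 2 of the next stage. These conventions and all function names are assumptions of the sketch, not definitions taken from the paper.

```python
def shuffle(line, n):
    """Perfect shuffle of an n-bit line number: rotate left by one bit."""
    mask = (1 << n) - 1
    return ((line << 1) & mask) | (line >> (n - 1))


def successors(pos, n):
    """Positions, within the next stage, of the two SEs fed by SE `pos`."""
    return shuffle(2 * pos, n) // 2, shuffle(2 * pos + 1, n) // 2


def paths(src, dst, n):
    """All paths, as lists of per-stage positions, from input-stage SE
    `src` (stage 0) to output-stage SE `dst` (stage n)."""
    found = []

    def walk(pos, stage, trail):
        if stage == n:
            if pos == dst:
                found.append(trail)
            return
        for nxt in successors(pos, n):
            walk(nxt, stage + 1, trail + [nxt])

    walk(src, 0, [src])
    return found


def labels(path, n):
    """Labels of the intermediate SEs (stages 1..n-1) on a path, using the
    numbering (i-1)*N/2 + position for an SE in stage i."""
    half = 1 << (n - 1)          # N/2 switches per stage
    return [(i - 1) * half + path[i] for i in range(1, n)]


# 16x16 SENE (n = 4): input-stage SE I_0 to output-stage SE O_0.
for path in paths(0, 0, 4):
    print(labels(path, 4))
```

Under these assumptions the sketch finds exactly two paths and prints the intermediate SE labels [0, 8, 16] and [1, 10, 20], which agree with the upper and lower paths listed in the example. The exhaustive walk takes time proportional to N and is meant only as a check, not as the O(log N) path generation of the lemmas.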

In a SENE, each SE in stage 0, the input stage, is connected to a pair of SEs in stage 1. Each
of these SEs is the root of a complete binary tree of SEs of height n whose leaves are the SEs of stage
n, the output stage. The two trees are disjoint, except that the leaves of the trees are identical. (See
Figure 1.) We call the tree (including the leaves) rooted at the smaller numbered switch the upper
broadcast tree (BT_U) and the tree (including the leaves) rooted at the larger numbered switch the
lower broadcast tree (BT_L). Omitting the input stage SEs, the set of upper paths from any input
processor forms the upper broadcast tree, and the set of lower paths forms the lower broadcast tree.


We define the upper network as the set of upper paths from each input to each output, omitting
the input stage SEs, and we define the lower network as the set of lower paths from each input to each
output, omitting the input stage SEs. Each input stage SE is connected to one SE in stage 1 of the
upper network and one SE in stage 1 of the lower network. Figure 1 depicts the upper and lower
networks of a 16x16 SENE, noting the connections of the input stage SEs and depicting the output
stage SEs in the center. These very regular paths from an input to the outputs offer us the structure
necessary to efficiently solve the reliability and decision problems.

Note from Figure 1 that the upper and lower networks are symmetrical. For a given SE g in
stage i of the upper network, there is an SE g' in the lower network in the corresponding position in
the same stage. Let L(g) denote this SE. For example, in a 16x16 SENE, L(0) = 1 and L(17) = 21.
For SE h in the lower network, let U(h) denote the SE in the corresponding position in the upper
network, so U(1) = 0 and U(21) = 17 in our example.

Because of the regular connection pattern between stages of SEs, it is straightforward to list the
SEs on a path from a specified input to a specified output or to determine the SEs connected to a given
SE j. We refer to an SE k in a stage preceding that of SE j such that a path exists in the fault-free
SENE from SE k to SE j as an ancestor of SE j. We refer to an SE k in a stage following that of SE j
such that a path exists in the fault-free SENE from SE j to SE k as a descendant of SE j.

We assume that the processor running our algorithm is a Random Access Machine that can
compute addition, subtraction, multiplication, division, and bitwise Boolean operations in one unit of
time (Trahan et al., 1991).

For an integer s, let #s denote its representation in binary.

Lemma 2.1. For any SE g in an NxN SENE, the following can be computed in constant time, given
a table T that can be generated in O(log N) time:

(i) the stage i in which SE g is located,

(ii) the pair of SEs in stage i+1 to which the outputs of SE g are connected,

(iii) the pair of SEs in stage i-1 connected to the inputs of SE g,

(iv) for an SE h in a stage preceding stage i, whether SE h is an ancestor of SE g,

(v) for an SE h in a stage following stage i, whether SE h is a descendant of SE g,

(vi) whether SE g is in the upper or lower network, and

(vii) if SE g is in the upper (lower) network, the SE L(g) (U(g)) in the corresponding position in the
lower (upper) network.

Lemma 2.2. For any given input I_j and output O_k in an NxN SENE, the upper and lower paths
from I_j to O_k can be generated in O(log N) time.

Terminal reliability (TR) is the probability that at least one path exists from a given input
processor to a given output processor of the network. K-terminal reliability is the probability that at
least one set of paths exists from a given input processor to each processor in a set of output
processors of the network. Broadcast reliability (BR) is the probability that at least one set of paths
exists from a given input processor to each output processor of the network. BR is a special case of
K-terminal reliability. Network reliability is the probability that at least one set of paths exists to
connect each input processor to each output processor.

In the SENE, each input processor is connected to the input of a single SE in the input stage,
and each output processor is connected to the output of a single SE in the output stage. In the
following, we describe our algorithms based on the input (output) stage SE to which an input (output)
processor is connected, rather than based on the input (output) processor itself.

We make the following assumptions for the reliability problems. Each SE in the input stage
and output stage is always working. Each SE in the other (intermediate) stages is working with
probability p and failed with probability q = 1 - p. Individual SE failure probabilities are statistically
independent. Each link is always working.

We will later show how to relax the assumption that the failure probability of each SE is
identical.


3.  Reliability  Evaluation  Algorithms 


3.1. TERMINAL RELIABILITY

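Given the structure of the SENE mentioned above, the TR problem is easily solved. Let I_j be
the specified input and O_k be the specified output. An NxN SENE contains exactly two paths,
that is, the upper and lower paths, each of length n = log2 N, from I_j to O_k; apart from the shared
input and output stage SEs, the paths are disjoint. (Note: All logarithms are taken to base 2 in this
paper.) The graph is simply series-parallel. Each path contains n - 1 intermediate SEs, each working
with probability p, while the input and output stage SEs always work. Let TR(N) denote the terminal
reliability of an NxN SENE.

TR(N) = 1 - (1 - p^{n-1})^2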

Theorem 3.1. The terminal reliability of an NxN SENE can be evaluated in O(loglog N) time.
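
The O(loglog N) bound reflects the cost of raising p to a power of size about log N by repeated squaring. The sketch below illustrates this, assuming that each of the two paths contains n - 1 intermediate SEs (as in the 16x16 example, where each path passes through three numbered SEs) and that the input and output stage SEs always work, so that TR(N) = 1 - (1 - p^{n-1})^2. The function names are hypothetical.

```python
def power(base, exponent):
    """Exponentiation by repeated squaring, using O(log exponent)
    multiplications."""
    result = 1.0
    while exponent > 0:
        if exponent & 1:
            result *= base
        base *= base
        exponent >>= 1
    return result


def terminal_reliability(N, p):
    """Terminal reliability of an NxN SENE (N a power of two) when every
    intermediate SE works independently with probability p."""
    n = N.bit_length() - 1          # n = log2 N
    path = power(p, n - 1)          # one path: n - 1 intermediate SEs in series
    return 1.0 - (1.0 - path) ** 2  # two disjoint paths in parallel


print(terminal_reliability(16, 0.9))
```

With n - 1 as the exponent, the loop in `power` runs about log(log N) times, which is where the loglog term comes from under the unit-cost arithmetic model.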

3.2. BROADCAST RELIABILITY

The structure of the SENE allows us to use a recursive approach to evaluating the broadcast
reliability of the SENE. For the BR problem, let I_j be the specified input SE, let A be the root of BT_U,
let C and E be the two SEs to which A is connected in stage 2, let B be the root of BT_L, and let D and
F be the two SEs to which B is connected in stage 2, where the label of SE C is less than that of E and
the label of D is less than that of F. (See Figure 2.)

Input I_j may reach all outputs by paths through A only, through B only, or some outputs
through A and the rest through B. We handle each of these cases separately. Case 1 comprises
instances in which A is working and B is failed; Case 2 comprises instances in which A is failed and B
is working; and Case 3 comprises instances in which both A and B are working. Each case describes
a disjoint collection of instances, so the overall reliability will be the sum over the three cases of the
probability that a set of paths exists from I_j to each output in each case.

Case 1. SE A is working and SE B is failed. No paths exist from I_j to any inner node in BT_L,
so the graph of nodes reachable from I_j comprises I_j, an edge from I_j to A, and a complete binary tree
of height n rooted at A. Exactly one path exists from I_j to each output, so every node in the binary tree
must be working. This tree contains N/2 leaves and N/2 - 1 inner nodes. The reliability of this case is
as follows.

R_A = q p^{N/2-1}

Case 2. SE A is failed and SE B is working. The analysis is analogous to that of Case 1.

Case 3. Both SE A and SE B are working. Working paths exist from I_j to SEs C, D, E, and
F. Call the outputs O_0 through O_{N/4-1} the left half outputs, and call the outputs O_{N/4} through
O_{N/2-1} the right half outputs. Input I_j can reach the left half outputs through SEs C and D, and I_j can
reach the right half outputs through SEs E and F. Observe that the probability P_L that I_j can reach all
the left half outputs is equal to the probability P_R that I_j can reach all the right half outputs. The
probability that I_j can reach all the outputs is equal to P_L P_R = (P_L)^2. Evaluating P_L reduces to the
same broadcast reliability evaluation problem in a network with half the number of outputs.

R_AB(N) = p^2 (BR(N/2))^2

Putting the three cases together, we obtain the following recurrence for a SENE with N
outputs.

BR(N) = 2 q p^{N/2-1} + p^2 (BR(N/2))^2.

The base case for the recurrence is BR(2) = 2pq + p^2.

Theorem 3.2. The broadcast reliability of an NxN SENE can be evaluated in O(log N) time.

Proof. We precompute p^{2^i} for i = 1, 2, ..., log N in time O(log N). We then evaluate the
recurrence equation in a constant amount of time for each of log N levels of recursion and so evaluate
the broadcast reliability of an NxN SENE in time O(log N). ∎
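
The three-case decomposition can be sketched in Python as a bottom-up loop that does a constant amount of work per level. To keep the exponents unambiguous, the sketch indexes the problem by m, the number of output-stage SEs (leaves) shared by the two broadcast trees, so that each tree has m - 1 inner nodes; this indexing and the function name are choices of the sketch rather than the paper's notation. Leaves always work, and every inner node works independently with probability p.

```python
def broadcast_reliability(m, p):
    """Probability that every one of m shared leaves is reachable through
    at least one of two complete binary trees (m a power of two).

    Cases per level, with roots A and B:
      A works, B fails: all m - 1 inner nodes of A's tree must work;
      A fails, B works: symmetric;
      both work: the left and right halves are independent subproblems.
    """
    q = 1.0 - p
    br = 1.0        # m = 1: the two roots coincide with the single leaf
    inner = 1.0     # p ** (size - 1), the all-inner-nodes-work probability
    size = 1
    while size < m:
        size *= 2
        inner = inner * inner * p          # now p ** (size - 1)
        br = 2.0 * q * inner + p * p * br * br
    return br


print(broadcast_reliability(2, 0.5))   # equals 2pq + p^2
print(broadcast_reliability(4, 0.5))
```

The loop runs log m times, and the running product `inner` plays the role of the precomputed powers of p in the proof. For m = 2 the result has the form 2pq + p^2, that is, the probability that at least one of the two roots works.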

Time O(log N) to evaluate BR is far better than the time complexity of previous algorithms
using the sum of disjoint products method and running in time exponential in N (Botting et al., 1989;
Rai and Trahan, 1989; Kulkarni and Trahan, 1991). Theorem 3.2 also establishes that the problem of
evaluating the broadcast reliability for a SENE is not #P-complete, as is the case for a general network.

The recurrence obtained for BR evaluation is very similar to recurrences generated by Varma
and Raghavendra (1989) for BR evaluation of other MINs. Their redundancy graphs for the
Generalized INDRA Network and Augmented C Network are very similar to the broadcast tree structure
shown in Figure 2 for the SENE. Their recurrence for BR on the Augmented C Network is almost
identical to that for the SENE.

3.3. DIFFERENT SWITCH RELIABILITIES

We now extend the solution method to two related problems. The first is the BR problem for
the SENE if SEs are allowed different probabilities of working, and the second is the K-terminal
reliability problem. The K-terminal reliability evaluation algorithm will use the BR evaluation
algorithm for different switch reliabilities as a building block.

Suppose that the reliabilities of individual SEs may differ. Let p_i denote the probability that SE
i is working. We will follow the previous decomposition approach with the same cases resulting, but
will evaluate the contribution of each case to the total reliability differently. The reliability function BR'
now has two arguments: G, the graph and associated reliabilities, and N, the size of the SENE. Let
G_L denote the decomposed graph containing the left half outputs and associated reliabilities, and let G_R
denote the decomposed graph containing the right half outputs and associated reliabilities.

The  complete  recurrence,  hence  recursive  algorithm,  for  this  situation  is  as  follows. 

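\[ \mathrm{BR}'(G, N) = p_a q_b \prod_{s \in BT_U} p_s + q_a p_b \prod_{s \in BT_L} p_s + p_a p_b \, \mathrm{BR}'(G_L, N/2) \, \mathrm{BR}'(G_R, N/2). \]

The base case for the recurrence is $\mathrm{BR}'(G, 2) = p_a q_b + q_a p_b + p_a p_b$.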
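
A minimal sketch of this recurrence follows, under an assumed data layout that is not from the source: the switches of the upper and lower broadcast trees are stored in two heap-ordered lists `pa` and `pb` (index 0 unused, children of index v at 2v and 2v+1), each tree has N - 1 switches, and the lowest-level pair feeds two outputs directly. Here q = 1 - p for each switch. The function returns the subtree products alongside BR' so that the products in the first two terms are not recomputed at every level; that bookkeeping is a convenience of this sketch, not a step stated in the text.

```python
def br_diff(pa, pb, v=1):
    """Evaluate BR' for the subproblem whose root pair has heap index v.

    pa[v], pb[v] : working probabilities of the upper-tree and lower-tree
                   switch at heap index v (hypothetical layout).
    Returns (BR', product over the upper subtree rooted at v,
                  product over the lower subtree rooted at v).
    """
    a, b = pa[v], pb[v]
    if 2 * v >= len(pa):                       # base case: N = 2
        return a * (1 - b) + (1 - a) * b + a * b, a, b
    br_l, up_l, low_l = br_diff(pa, pb, 2 * v)
    br_r, up_r, low_r = br_diff(pa, pb, 2 * v + 1)
    up, low = up_l * up_r, low_l * low_r       # products over proper descendants
    # Case 1: A works, B fails -> the whole upper tree must work.
    # Case 2: A fails, B works -> the whole lower tree must work.
    # Case 3: both work        -> the two halves are independent subproblems.
    br = a * (1 - b) * up + (1 - a) * b * low + a * b * br_l * br_r
    return br, a * up, b * low


# N = 4 example: heap indices 1..3, index 0 is padding.
pa = [None, 0.9, 0.8, 0.7]
pb = [None, 0.95, 0.85, 0.75]
print(br_diff(pa, pb)[0])
```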

Theorem 3.3. The broadcast reliability of an N×N SENE in which each switch may have a different
reliability can be evaluated in O(N log N) time.

3.4. K-TERMINAL RELIABILITY EVALUATION

K-terminal reliability is the probability that at least one set of paths exists from a given input
processor to each output processor in a set K of size k. BR is a special case of K-terminal reliability in
which set K comprises the set of all output processors; TR is a special case of K-terminal reliability in
which set K comprises a single element. Given an N×N SENE, a specified input $I_j$, and a set K of
outputs, where |K| = k, we wish to evaluate the probability that $I_j$ can reach each output
$O_m \in K$. We present two algorithms for computing K-terminal reliability, the first for instances in which
k > log N, and the second for instances in which k ≤ log N. For these algorithms, the K-terminal
reliability is computed the same way whether switch reliabilities are the same or different, so for clarity
and generality we describe the situation in which these reliabilities are different.

Both algorithms start with the same initialization procedure, as follows. Set up an initially
empty array B, where element B(g) corresponds to SE g. Array B contains O(N log N) elements. For
each $O_m \in K$ and each SE g in the upper path or in the lower path from $I_j$ to $O_m$, mark element B(g).
Each path can be computed in O(log N) time by Lemma 2.2, and there are 2k paths to trace, so this
initialization takes O(k log N) time.

Algorithm K1: k > log N.

To evaluate the K-terminal reliability, execute the algorithm above for the BR problem with
different switch reliabilities, making the following modification. If SE s is marked, then leave its
reliability as $p_s$; if SE s is not marked, then treat its reliability as 1. This modification treats the parts of
the broadcast trees from input $I_j$ that reach only outputs not in K as being completely reliable, so the
result returned by this algorithm is exactly the K-terminal reliability. The time complexity of this
algorithm is O(N log N), as for the different switch reliabilities problem.
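
Algorithm K1 can be sketched as a thin wrapper around the BR' recurrence. The layout is an assumption made for illustration only: heap-ordered lists `pa` and `pb` for the upper and lower trees (index 0 unused), with the lowest-level pair at heap index N/2 + m/2 (integer division) feeding output m. The real path computation of Lemma 2.2 is replaced here by walking parent indices, and the names `mark` and `k_terminal_k1` are hypothetical.

```python
def br_diff(pa, pb, v=1):
    """BR' recurrence; returns (BR', upper subtree product, lower subtree product)."""
    a, b = pa[v], pb[v]
    if 2 * v >= len(pa):                       # base case: N = 2
        return a * (1 - b) + (1 - a) * b + a * b, a, b
    br_l, up_l, low_l = br_diff(pa, pb, 2 * v)
    br_r, up_r, low_r = br_diff(pa, pb, 2 * v + 1)
    up, low = up_l * up_r, low_l * low_r
    return a * (1 - b) * up + (1 - a) * b * low + a * b * br_l * br_r, a * up, b * low


def mark(N, K):
    """Mark every pair index on the root-to-output path of each output in K."""
    marked = [False] * N
    for m in K:
        v = N // 2 + m // 2                    # lowest-level pair feeding output m
        while v >= 1:
            marked[v] = True
            v //= 2                            # move to the parent pair
    return marked


def k_terminal_k1(pa, pb, K):
    """Algorithm K1: unmarked switches are treated as perfectly reliable."""
    marked = mark(len(pa), K)
    ea = [p if m else 1.0 for p, m in zip(pa, marked)]
    eb = [p if m else 1.0 for p, m in zip(pb, marked)]
    return br_diff(ea, eb)[0]


pa = [None, 0.9, 0.8, 0.7, 0.95, 0.85, 0.75, 0.65]
pb = [None, 0.6, 0.7, 0.8, 0.9, 0.55, 0.65, 0.75]
print(k_terminal_k1(pa, pb, [0, 7]))
```

With K equal to the full output set every switch is marked and the call reduces to the plain BR' evaluation.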

Algorithm K2: k ≤ log N.

If an SE g is not marked, then no path from $I_j$ to any output in the set K contains SE g. Note
further that no path from $I_j$ to any output in the set K contains any descendant of SE g because of the
tree structure of the broadcast paths. Therefore, unmarked SEs and their successors will be handled as
whole subtrees without further recursion.

The same decomposition approach will again be followed, but the contribution of each case to
the total reliability will be evaluated differently depending on whether an SE is marked or not. The
reliability function KR has two arguments: G, the graph and associated reliabilities, and N, the size of
the SENE. For the specified switches SE A and SE B, either both are marked or both are unmarked
because they share the same set of outputs that are descendants in stage n. Let $BT_U^M$ ($BT_L^M$) denote
the set of marked SEs in $BT_U$ ($BT_L$).

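The recurrence, and hence recursive Algorithm K2, is specified below.

\[ \mathrm{KR}(G, N) = p_a q_b \prod_{s \in BT_U^M} p_s + q_a p_b \prod_{s \in BT_L^M} p_s + p_a p_b \, \mathrm{KR}(G_L, N/2) \, \mathrm{KR}(G_R, N/2), \quad \text{if both } A \text{ and } B \text{ are marked;} \]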

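\[ \mathrm{KR}(G, N) = 1, \quad \text{if both } A \text{ and } B \text{ are unmarked.} \]

The base cases for the recurrence are $\mathrm{KR}(G, 2) = p_a q_b + q_a p_b + p_a p_b$, if both A and B are
marked; $\mathrm{KR}(G, 2) = 1$, if both A and B are unmarked. (An unmarked subtree contains no output in K,
so it places no constraint on the result and contributes a factor of 1 to the product in the third term.)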
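
A minimal sketch of the K2 recursion follows. It rests on assumptions that are not from the source: a heap-ordered layout (`pa`, `pb`, index 0 unused, children of v at 2v and 2v+1), the lowest-level pair at index N/2 + m/2 (integer division) feeding output m, and an unmarked subtree contributing a factor of 1 because it contains no output of interest. The names `mark`, `kr`, and `k_terminal_k2` are hypothetical.

```python
def mark(N, K):
    """Mark every pair index on the root-to-output path of each output in K."""
    marked = [False] * N
    for m in K:
        v = N // 2 + m // 2                    # lowest-level pair feeding output m
        while v >= 1:
            marked[v] = True
            v //= 2
    return marked


def kr(pa, pb, marked, v=1):
    """Return (KR, product of marked upper switches, product of marked lower
    switches) for the subproblem rooted at pair v."""
    if not marked[v]:
        return 1.0, 1.0, 1.0                   # whole subtree skipped, no recursion
    a, b = pa[v], pb[v]
    if 2 * v >= len(pa):                       # base case: N = 2, pair is marked
        return a * (1 - b) + (1 - a) * b + a * b, a, b
    kr_l, up_l, low_l = kr(pa, pb, marked, 2 * v)
    kr_r, up_r, low_r = kr(pa, pb, marked, 2 * v + 1)
    up, low = up_l * up_r, low_l * low_r       # marked proper descendants only
    val = a * (1 - b) * up + (1 - a) * b * low + a * b * kr_l * kr_r
    return val, a * up, b * low


def k_terminal_k2(pa, pb, K):
    """Algorithm K2: recurse only through marked pairs."""
    return kr(pa, pb, mark(len(pa), K))[0]


pa = [None, 0.9, 0.8, 0.7, 0.95, 0.85, 0.75, 0.65]
pb = [None, 0.6, 0.7, 0.8, 0.9, 0.55, 0.65, 0.75]
print(k_terminal_k2(pa, pb, [0, 7]))
```

In this sketch the recursion touches only marked pairs and their immediate children, which is where the saving for small k comes from.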

The time to evaluate KR(G, N) is the sum of the time to evaluate each of the three terms. The
first two terms may be evaluated in O(k log N) time. The time to evaluate the third term is the sum of
the times to evaluate two KR functions for graphs with N/2 outputs. Thus, the overall time to evaluate
KR(G, N) is as follows.

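\[
\begin{aligned}
T_{KR}(N) &= ck \log N + 2\,T_{KR}(N/2), \quad \text{where } c \text{ is a constant} \\
          &= ck \sum_{i=1}^{\log N} 2^{i-1} \bigl(\log N - (i-1)\bigr) \\
          &= O(kN).
\end{aligned}
\]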
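
The step from the sum to O(kN) can be made explicit. The following worked evaluation writes n = log N and assumes that the recursion bottoms out at T(2) = ck, which the text does not state; with that assumption the sum has a simple closed form.

```latex
\begin{aligned}
\sum_{i=1}^{n} 2^{i-1}\bigl(n-(i-1)\bigr)
  &= \sum_{j=1}^{n} j\,2^{\,n-j}
     && \text{(substituting } j = n-i+1\text{)} \\
  &= 2^{n}\Bigl(2 - \frac{n+2}{2^{n}}\Bigr)
     && \text{(using } \textstyle\sum_{j=1}^{n} j/2^{j} = 2 - (n+2)/2^{n}\text{)} \\
  &= 2^{n+1} - n - 2 , \\[4pt]
T_{KR}(N) &= ck\,(2N - \log N - 2) = O(kN).
\end{aligned}
```

For example, N = 8 gives 3 + 4 + 4 = 11 = 2 · 8 - 3 - 2 units of ck.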

This time measure is an overestimate, as it does not account for the fact that the size k of the set of
outputs of interest decreases at lower levels of recursion. (The amount of decrease at each level of
recursion depends on the exact set of elements in K.) Since the initialization time is O(k log N), the
overall time to execute Algorithm K2 to compute K-terminal reliability for a SENE of size N is O(kN).

Theorem 3.4. The K-terminal reliability for a SENE can be computed in O(N · min{k, log N})
time.

Figure 2. Upper and lower networks for a 16×16 SENE


